BANYAN. XI. The BANYAN Σ Multivariate Bayesian Algorithm to Identify Members of Young Associations with 150 pc

Jonathan Gagné; Eric E. Mamajek; Lison Malo; Adric Riedel; David Rodriguez; David Lafrenière; Jacqueline K. Faherty; Olivier Roy-Loubier; Laurent Pueyo; Annie C. Robin; René Doyon

doi:10.3847/1538-4357/aaae09

1. Introduction

Coeval associations of stars that formed from a single molecular cloud are valuable benchmarks to study how the properties of stars evolve with time (e.g., Zuckerman & Song 2004; Torres et al. 2008). While precisely measuring the age of an individual star is challenging, the simultaneous study of a large ensemble of stars can provide age measurements with precisions down to a few Myr (e.g., Bell et al. 2015). A small number of young associations in the solar neighborhood that were identified to date are of particular interest in part because their lower-mass members can be studied easily. They are also coeval populations, making them valuable for measuring the initial mass function and serving as age calibrators for members across all masses.

Recent surveys (e.g., Gagné et al. 2014, 2015c, 2015b; Kellogg et al. 2015; Aller et al. 2016; Faherty et al. 2016; Liu et al. 2016) have started uncovering the substellar and planetary-mass members of nearby young stellar associations, which will make it possible to understand how their fundamental properties evolve with time.

Because young associations of the solar neighborhood are sparse and span up to ∼20 pc in size, the distribution of their members can cover wide areas on the celestial sphere, and in some cases cover it almost entirely (e.g., the AB Doradus and β Pictoris moving groups; Zuckerman et al. 2001a, 2004). As a consequence, selection criteria based on sky coordinates and photometry alone are problematic.

As they formed recently and have not yet been perturbed significantly by other stars in the Galaxy, the members of a given young association still share similar space velocities UVW, with typical velocity dispersions below ∼3 km s⁻¹. This provides a way to identify the members of a young association; however, measuring their full kinematics requires not only proper motions, but also absolute radial velocities and parallaxes for every star. This constitutes the most challenging aspect in identifying their faint, low-mass members, as obtaining such measurements for a large sample of faint objects is prohibitive.

Various methods have been developed to identify members of young associations when only sky coordinates and proper motions are available. These include the convergent point tool (e.g., Mamajek 2005; Torres et al. 2006; Rodriguez et al. 2011), various goodness-of-fit metrics (e.g., Kraus et al. 2014; Bowler et al. 2017; Shkolnik et al. 2017), and the "good box" method (Zuckerman & Song 2004). Malo et al. (2013) developed BANYAN (Bayesian Analysis for Nearby Young AssociatioNs), an algorithm based on Bayesian inference where moving groups are modeled with unidimensional Gaussian distributions in Galactic coordinates XYZ and space velocities UVW. More complex algorithms, such as BANYAN II (Gagné et al. 2014) and LACEwING (Riedel et al. 2017b), have more recently been developed, where associations are modeled with freely rotating tridimensional Gaussian ellipsoids in position and velocity spaces. BANYAN I included distance constraints from field and young sequences in an M_J versus I_C − J color–magnitude diagram in the 2MASS (Skrutskie et al. 2006) and Cousins (see Malo et al. 2013 for details) systems, which were defined for spectral classes K and M. BANYAN II included similar constraints based on two color–magnitude diagrams (J − K_S versus M_W1 and H − W2 versus M_W1) in the 2MASS and WISE (Wright et al. 2010) systems, which were defined for spectral types later than M5.

These tools made it possible to identify hundreds of candidate members in nearby associations of stars, spanning the planetary to stellar-mass domains (e.g., Malo et al. 2013, 2014; Gagné et al. 2015b; Faherty et al. 2016; Liu et al. 2016). The majority of these classification tools include only the seven youngest (∼10–200 Myr) and nearest (≲100 pc) moving groups, with the exception of LACEwING, which includes 16 associations and open clusters with a larger age range (∼5–800 Myr).

The upcoming data releases of the Gaia mission (Gaia Collaboration et al. 2016b) will mark a new era in the study of young associations, as they will provide precise parallax measurements for a billion stars in the Galaxy, covering the full members of all associations within 150 pc down to late-M spectral types (Smart et al. 2017). This advancement in the census of members will improve kinematic models, lead to detailed measurements of initial mass functions, and open the door to the discovery of new sparse associations (e.g., see Oh et al. 2017). The first release of the Gaia mission (Gaia-DR1; Gaia Collaboration et al. 2016a) has already provided two million parallax measurements for stars in the Tycho-2 catalog (Høg et al. 2000), which have not yet been used to improve the membership classification tools described above.

This work presents BANYAN Σ, the next generation of the BANYAN tool based on Bayesian inference, which includes 27 associations with ages in the range ∼1–800 Myr, completing the sample of known and well-defined associations within ∼150 pc.⁹ BANYAN Σ includes a significantly improved Gaussian mixture model of the Galactic disk which captures a larger fraction of field interlopers, and updated models of young associations that benefit from the most recent Gaia-DR1 parallax measurements. The models are also advanced to six-dimensional (6D) multivariate Gaussians that capture full correlations in the XYZUVW distribution of members, including those in mixed spatial-kinematic coordinates. Two versions of the BANYAN Σ code (IDL¹⁰ and python) are made publicly available,¹¹ and a web portal is made available for single-object queries.¹²

Most previously available classification tools rely on time-consuming algorithms, such as numerical integrals, which make it challenging to analyze large data sets such as the upcoming full Gaia release, and use various approximations in converting observables and kinematic models to probabilities that affect their classification performance. In BANYAN Σ, most of these approximations are removed, and Bayesian marginalization integrals are solved analytically. As a consequence, the tool is ≈80,000 times faster than its predecessor BANYAN II, making it easier to analyze very large data sets. BANYAN Σ is also the first classification tool to include the Taurus, ρ Ophiuchi and Corona Australis star-forming regions (e.g., Wichmann et al. 2000; Reipurth 2008), and the nearest OB association Sco-Cen, composed of the three subgroups: Upper Scorpius, Lower-Centaurus Crux, and Upper Centaurus Lupus (e.g., Blaauw 1946; de Zeeuw et al. 1999; Pecaut & Mamajek 2016). It is also the first such tool to include the IC 2602 (e.g., Whiteoak 1961; Mermilliod et al. 2009), IC 2391 (e.g., Platais et al. 2007), and Platais 8 (Platais et al. 1998) clusters, and one of the first to include the Pleiades cluster. For the last of these, Sarro et al. (2014) presented a multivariate Gaussian mixture model to assign membership probabilities based on kinematic and photometric observables. Their model uses a larger number of free parameters, made possible by the large number of known Pleiades members, but it does not include other young associations.

In Section 2, the framework of BANYAN II is described, which serves as a starting point for BANYAN Σ, described in detail in Section 3. An updated list of bona fide members for 27 young associations within 150 pc is presented in Section 4, and is used to build the multivariate Gaussian kinematic models of BANYAN Σ in Section 5. Section 6 presents a multivariate Gaussian mixture model of field stars based on the Besançon Galactic model. A choice of Bayesian priors that ensures fixed recovery rates in all associations when using a P = 90% Bayesian probability threshold is described in Section 7. A performance analysis of BANYAN Σ is presented in Section 8, and is compared to other tools available in the literature. The membership of stars previously considered as ambiguous is revisited in Section 9 using BANYAN Σ. This work is concluded in Section 10.

2. The BANYAN II Algorithm

In this section, the Bayesian framework behind the BANYAN II tool (Gagné et al. 2014) is described. The framework of BANYAN Σ will start from the same principles, but will include several improvements that are described in Section 3.

BANYAN II is a Bayesian classification algorithm, which uses the direct kinematic observables of a star {O_i}, namely its sky position (α, δ), proper motion (μ_α, μ_δ), radial velocity (ν) and distance (ϖ), to determine the probability that the star belongs to a population described by a hypothesis H_k, corresponding to either the Galactic field or one of several young associations. This is done by applying Bayes' theorem:

$\begin{eqnarray}&&P({H}_{k}| \{{O}_{i}\})=\displaystyle \frac{P({H}_{k})P(\{{O}_{i}\}| {H}_{k})}{P(\{{O}_{i}\})},\end{eqnarray} \tag{ 1 }$

where the likelihood $P(\{{O}_{i}\}| {H}_{k})$ is the probability that a member of H_k displays the observables {O_i}, the prior P(H_k) is the probability that a star belongs to hypothesis H_k irrespective of its kinematic properties, and the fully marginalized likelihood P({O_i}) is the probability that a star displays observables {O_i} irrespective of its membership. Once the priors and likelihoods are determined, the fully marginalized likelihood can be obtained with:

$\begin{eqnarray}&&P(\{{O}_{i}\})=\displaystyle \sum _{k}P(\{{O}_{i}\}| {H}_{k})P({H}_{k}).\end{eqnarray} \tag{ 2 }$
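Equations (1) and (2) can be illustrated with a minimal Python sketch. The function name `membership_probabilities` and the numerical priors and likelihoods below are hypothetical and chosen for illustration only; they are not values used by BANYAN II or BANYAN Σ.

```python
def membership_probabilities(priors, likelihoods):
    """Return P(H_k | {O_i}) for every hypothesis (Equations (1) and (2))."""
    if len(priors) != len(likelihoods):
        raise ValueError("need one prior and one likelihood per hypothesis")
    # Equation (2): fully marginalized likelihood, summed over hypotheses.
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    if evidence <= 0.0:
        raise ValueError("all hypotheses have zero likelihood")
    # Equation (1): posterior probability of each hypothesis.
    return [p * l / evidence for p, l in zip(priors, likelihoods)]


# Illustrative numbers only: the field followed by two young associations.
priors = [0.98, 0.01, 0.01]
likelihoods = [1.0e-9, 4.0e-7, 1.0e-8]
posteriors = membership_probabilities(priors, likelihoods)
```

Because the same denominator is shared by all hypotheses, the returned probabilities sum to one by construction.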

It is often the case that the radial velocity (ν) and/or distance (ϖ) of a star are not known, preventing a direct calculation of the likelihood $P(\{{O}_{i}\}| {H}_{k})$ . The case where both measurements are missing will be considered here. In this scenario, a likelihood probability can still be obtained by marginalizing over both missing parameters:

$\begin{eqnarray}&&P(\{{O}_{i}\}| {H}_{k})={\int }_{-\infty }^{\infty }{\int }_{0}^{\infty }{{ \mathcal P }}_{o}(\{{O}_{i}\}| {H}_{k})d\varpi \,d\nu ,\end{eqnarray} \tag{ 3 }$

where the symbol ${{ \mathcal P }}_{o}$ is used to distinguish probability densities from probabilities P that are free of physical units.

Gagné et al. (2014) demonstrated that young associations can be well described with Gaussian distributions by working in the Galactic position XYZ and space velocity UVW frame of reference {Q_i}. The distributions of direct observables {O_i} for the members of a young association would be accurately described only by complex functions in this coordinate frame. For this reason, BANYAN II approximated the likelihood by computing it directly in the {Q_i} parameter space:

$\begin{eqnarray}&&P(\{{O}_{i}\}| {H}_{k})={\int }_{-\infty }^{\infty }{\int }_{0}^{\infty }{{ \mathcal P }}_{o}(\{{Q}_{j}({\{{O}_{i}\}}^{{\prime} },\nu ,\varpi )\}| {H}_{k})d\varpi \,d\nu ,\end{eqnarray} \tag{ 4 }$

where ${\{{O}_{i}\}}^{{\prime} }$ represents the set of observables excluding the radial velocity ν and distance ϖ. Equation (4) inherently ignores the Jacobian of the transformation $\{{O}_{i}\}\to \{{Q}_{i}\}$ , discussed further in Section 3. In Gagné et al. (2014), both integrals were solved numerically on a uniform grid of 500 × 500 points over ν and ϖ, covering −35 to 35 km s⁻¹ and 0.1 to 200 pc, respectively. On each point (ν, ϖ) of the grid, the observables ${\{{O}_{i}\}}^{{\prime} }$ were transformed to the {Q_i} frame of reference, and compared with a model of hypothesis H_k to derive the probability density ${{ \mathcal P }}_{o}(\{{Q}_{j}({\{{O}_{i}\}}^{{\prime} },\nu ,\varpi )\}| {H}_{k})$ . The sum of all probability densities on the grid was then taken as an approximation of the likelihood $P(\{{O}_{i}\}| {H}_{k})$ . These approximations were mostly limiting in that they prevented the inclusion of high-velocity ( $| \nu | \gt 35\,\mathrm{km}\,{{\rm{s}}}^{-1}$ ) or distant (ϖ > 200 pc) stars, but they also required 250,000 probability density calculations for each star.

As measurement errors on the sky position of a star are always small enough to have a negligible contribution to XYZUVW compared to those on proper motion, radial velocity or distance measurements, they were ignored completely in BANYAN II, and will still be ignored in BANYAN Σ. Properly including the measurement errors on proper motion would require the introduction of two additional integrals to obtain a modified likelihood that takes error bars into account:

$\begin{eqnarray*}&&{P}_{e}(\{{O}_{i}\}| {H}_{k})={\int }_{-\infty }^{\infty }{\int }_{-\infty }^{\infty }P(\{{O}_{i}\}| {H}_{k}){{ \mathcal P }}_{{\rm{m}}}({\mu }_{\alpha },{\mu }_{\delta })d{\mu }_{\alpha }\,d{\mu }_{\delta },\end{eqnarray*}$

where ${{ \mathcal P }}_{{\rm{m}}}({\mu }_{\alpha },{\mu }_{\delta })$ is a probability density function describing the proper motion measurement, such as the product of two Gaussian distributions centered on the measured values with the appropriate characteristic widths. Because numerically solving this likelihood would be impractical, BANYAN II approximated the effect of proper motion errors by using an error propagation formula to obtain error bars on U, V, and W, and added them in quadrature to the characteristic widths of the Gaussian models describing each hypothesis H_k.

The kinematic models of BANYAN II are based on Gaussian ellipsoids in XYZ and UVW space, which are freely rotated along any axis. Models for seven young associations were included: TW Hya (TWA; de La Reza et al. 1989; Kastner et al. 1997), β Pictoris (βPMG; Zuckerman et al. 2001a), Tucana-Horologium (THA; Torres et al. 2000; Zuckerman et al. 2001b), Carina (CAR; Torres et al. 2008), Columba (COL; Torres et al. 2008), Argus (ARG; Makarov & Urban 2000) and AB Doradus (ABDMG; Zuckerman et al. 2004). The model of field stars was obtained by fitting a spatial and a kinematic Gaussian ellipsoid to the Besançon Galactic model within 200 pc.

3. BANYAN Σ: An Improved Algorithm

The framework of BANYAN Σ improves on BANYAN II by (1) using the analytical solution to the marginalization integrals over radial velocity and distance, (2) using multivariate Gaussian models for the young associations and a mixture of multivariate Gaussians to model the Galactic field, (3) removing several approximations in the calculation of the Bayesian likelihood, (4) accounting for parallax motion, and (5) including a larger number of young associations. The algorithm of BANYAN Σ is described in this section, and additional improvements with respect to the kinematic models are described in Sections 4–6. The models of BANYAN Σ are built from a "training set" consisting of a set of bona fide or high-likelihood members of young associations compiled from the literature. The models are therefore not built statistically, and the algorithm of BANYAN Σ is analogous to a Bayesian classification algorithm with a Gaussian mixture model (e.g., see Bishop & Nasrabadi 2007 and McLachlan & Peel 2000).

The multivariate Gaussian models of BANYAN Σ are described in Section 3.1, and the change of coordinates from the direct observables {O_i} to the Galactic frame of reference {Q_i} is described in Section 3.2. The analytical solution of the Bayesian likelihood is presented in Section 3.3, and the determination of the radial velocity ν and distance ϖ that maximize the Bayesian likelihood (Equation (3)) is developed in Section 3.4. Section 3.5 presents a method to approximate the effect of measurement errors on proper motion. Section 3.6 details a new improvement to BANYAN Σ, allowing it to apply a parallax motion correction when the distance of a star is not known and its proper motion measurement is based on two epochs only. Sections 3.7–3.9 describe additional options to the BANYAN Σ algorithm, to include measurements of radial velocity and/or distance, constraints from spectrophotometric observables, or to ignore the spatial distribution of young associations.

3.1. Kinematic Models

The distributions of stars in young stellar associations and in the Galactic field are modeled with multivariate Gaussian distributions in XYZUVW space. This is a generalization over the freely rotating individual XYZ and UVW Gaussian models used in both BANYAN II (Gagné et al. 2014) and LACEwING (Riedel et al. 2017b), as it models correlations between mixed spatial and kinematic coordinates. The kinematic model corresponding to hypothesis H_k can be written as:

$\begin{eqnarray}{{ \mathcal P }}_{M}(\bar{Q},\bar{\tau },\bar{\bar{{\rm{\Sigma }}}}) & = & \displaystyle \frac{{e}^{-\tfrac{1}{2}{{ \mathcal M }}^{2}}}{\sqrt{{(2\pi )}^{6}| \bar{\bar{{\rm{\Sigma }}}}| }},\\ {\rm{where}}\,{ \mathcal M } & = & \sqrt{{(\bar{Q}-\bar{\tau })}^{T}{\bar{\bar{{\rm{\Sigma }}}}}^{-1}(\bar{Q}-\bar{\tau })}\end{eqnarray} \tag{ 5 }$

is the Mahalanobis distance and the bar (e.g., $\bar{v}$ ) and double-bar (e.g., $\bar{\bar{M}}$ ) symbols are used to indicate vectors and matrices, respectively, in XYZUVW space. $\bar{Q}$ is a 6D vector built from {Q_i}, which correspond to the XYZUVW coordinates of an object. The Mahalanobis distance is a generalization of the concept of measuring how many standard deviations a data point is from the center of a Gaussian distribution, and is applied to multivariate Gaussian distributions in the present work. A Mahalanobis distance has no units and accounts for correlations in the multivariate Gaussian probability density function (Mahalanobis 1936).

The multivariate Gaussian model includes a total of 27 free parameters: six are stored in the $\bar{\tau }$ vector and indicate the center of the association; six are stored in the diagonal of the covariance matrix $\bar{\bar{{\rm{\Sigma }}}}$ and indicate the 6D size of the association; and 15 more are stored in the independent off-diagonal elements of $\bar{\bar{{\rm{\Sigma }}}}$ and indicate the orientation of the ellipsoid in 6D space or, equivalently, the correlations between each combination of coordinates. ${\bar{x}}^{T}$ indicates a vector transposition and ${\bar{\bar{s}}}^{-1}$ a matrix inverse.

The Bayesian likelihoods in the Galactic frame of reference {Q_i} can thus be written as:

$\begin{eqnarray*}&&{{ \mathcal P }}_{q}(\{{Q}_{i}\}| {H}_{k})={{ \mathcal P }}_{M}(\bar{Q},\bar{\tau },\bar{\bar{{\rm{\Sigma }}}}),\end{eqnarray*}$

where the dependencies of $\bar{\tau }$ and $\bar{\bar{{\rm{\Sigma }}}}$ on the association index k are implicit.

A simple approach for obtaining the parameters of a kinematic model is to calculate the average position $\bar{\tau }$ of the members in XYZUVW space and their variances and covariances to build the $\bar{\bar{{\rm{\Sigma }}}}$ matrix. A method that is more robust to outliers, and accounts for individual measurement errors, is presented in Section 5.
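The simple approach described above, together with the density of Equation (5), can be sketched with NumPy as follows. This is an illustration under stated assumptions: the function names are hypothetical, the member list is synthetic, and the sketch is not the outlier-robust method of Section 5.

```python
import numpy as np


def fit_simple_model(members_xyzuvw):
    """Mean vector (tau) and covariance matrix (Sigma) of an (N, 6) array."""
    members = np.asarray(members_xyzuvw, dtype=float)
    tau = members.mean(axis=0)
    sigma = np.cov(members, rowvar=False)  # 6x6 sample covariance
    return tau, sigma


def mahalanobis(q, tau, sigma):
    """Mahalanobis distance of Equation (5)."""
    diff = np.asarray(q, dtype=float) - tau
    # Solve Sigma y = diff rather than forming the inverse explicitly.
    return float(np.sqrt(diff @ np.linalg.solve(sigma, diff)))


def model_density(q, tau, sigma):
    """Multivariate Gaussian density P_M of Equation (5)."""
    m = mahalanobis(q, tau, sigma)
    norm = np.sqrt((2.0 * np.pi) ** 6 * np.linalg.det(sigma))
    return float(np.exp(-0.5 * m ** 2) / norm)


# Synthetic members for illustration only (XYZ in pc, UVW in km/s).
rng = np.random.default_rng(0)
true_tau = np.array([10.0, -20.0, 5.0, -10.0, -16.0, -9.0])
members = true_tau + rng.normal(size=(200, 6)) * [8, 8, 4, 1.5, 1.0, 1.0]
tau, sigma = fit_simple_model(members)
```

With an identity covariance matrix the Mahalanobis distance reduces to the Euclidean distance, which gives a quick sanity check of the implementation.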

3.2. Change of Coordinates

Solving the Bayesian likelihood in Equation (3) requires applying a change of coordinates from the observables frame of reference {O_i} to the Galactic frame of reference {Q_i}. The equations for this transformation are detailed by Johnson & Soderblom (1987), where the components of $\bar{Q}$ can be written as a linear combination of the radial velocity ν and the distance ϖ:

$\begin{eqnarray}&&\bar{Q}=\bar{{\rm{\Omega }}}\nu +\bar{{\rm{\Gamma }}}\varpi ;\end{eqnarray} \tag{ 6 }$

the components of $\bar{{\rm{\Omega }}}$ and $\bar{{\rm{\Gamma }}}$ are:

$\begin{eqnarray*}\bar{{\rm{\Omega }}} & = & (0,0,0,{M}_{0},{M}_{1},{M}_{2})=(0,{\boldsymbol{M}}),\\ \bar{{\rm{\Gamma }}} & = & ({\lambda }_{0},{\lambda }_{1},{\lambda }_{2},{N}_{0},{N}_{1},{N}_{2})=({\boldsymbol{\lambda }},{\boldsymbol{N}}),\end{eqnarray*}$

and symbols in bold represent 3D vectors or matrices in XYZ or UVW space.

The vectors ${\boldsymbol{\lambda }}$ , ${\boldsymbol{M}}$ and ${\boldsymbol{N}}$ transform the sky position, proper motion, radial velocity, and distance to XYZUVW following:

$\begin{eqnarray}{\boldsymbol{\lambda }} & = & (\cos b\cos l,\cos b\sin l,\sin b),\\ {\boldsymbol{M}} & = & {\boldsymbol{ \mathcal T }}{\boldsymbol{ \mathcal A }}\,{\boldsymbol{m}},\\ {\boldsymbol{N}} & = & {\boldsymbol{ \mathcal T }}{\boldsymbol{ \mathcal A }}\,{\boldsymbol{n}},\\ {\boldsymbol{ \mathcal A }} & = & \left[\begin{array}{ccc}\cos \alpha \cos \delta & -\sin \alpha & -\cos \alpha \sin \delta \\ \sin \alpha \cos \delta & \cos \alpha & -\sin \alpha \sin \delta \\ \sin \delta & 0 & \cos \delta \end{array}\right],\\ {\boldsymbol{m}} & = & (1,0,0),\\ {\boldsymbol{n}} & = & \kappa (0,{\mu }_{\alpha }\cos \delta ,{\mu }_{\delta }),\end{eqnarray} \tag{ 7 }$

where l and b are the Galactic longitude and latitude, α and δ are the R.A. and decl., ${\mu }_{\alpha }\cos \delta$ and μ_δ are the proper motion, and κ ≈ 4.74 · 10⁻³ km s⁻¹ is the velocity equivalent of 10⁻³ au yr⁻¹, so that proper motions are expressed in mas yr⁻¹.

The ${\boldsymbol{ \mathcal T }}$ matrix is a combination of rotation matrices involving the equatorial position of the North Galactic Pole, and is detailed by Johnson & Soderblom (1987). We however use a definition of ${\boldsymbol{ \mathcal T }}$ where the first row has the opposite sign of that defined by Johnson & Soderblom (1987), so that U points toward the Galactic center and UVW forms a right-handed system:¹³

$\begin{eqnarray*}{\boldsymbol{ \mathcal T }}=\left[\begin{array}{ccc}-0.054875560 & -0.87343709 & -0.48383502\\ 0.49410943 & -0.44482963 & 0.74698224\\ -0.86766615 & -0.19807637 & 0.45598378\end{array}\right].\end{eqnarray*}$
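The transformation of Equations (6) and (7) can be sketched in plain Python. The function names are hypothetical, and the value κ = 4.74047 · 10⁻³ is an assumed, more precise form of the constant quoted above. The sketch also relies on the observation that ${\boldsymbol{\lambda }}$ is the Galactic unit vector toward the star and therefore coincides with ${\boldsymbol{M}}$, so that l and b do not need to be computed explicitly.

```python
import math

KAPPA = 4.74047e-3  # km/s per (mas/yr * pc); assumed value of kappa

# Rotation matrix T as printed above (equatorial J2000 to Galactic).
T = [
    [-0.054875560, -0.87343709, -0.48383502],
    [0.49410943, -0.44482963, 0.74698224],
    [-0.86766615, -0.19807637, 0.45598378],
]


def matvec(matrix, vector):
    """Product of a 3x3 matrix with a 3-vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def omega_gamma(ra_deg, dec_deg, pmra_cosdec, pmdec):
    """Return the 6D vectors Omega and Gamma of Equation (6).

    Proper motions are in mas/yr, with pmra_cosdec = mu_alpha * cos(delta).
    """
    a, d = math.radians(ra_deg), math.radians(dec_deg)
    A = [
        [math.cos(a) * math.cos(d), -math.sin(a), -math.cos(a) * math.sin(d)],
        [math.sin(a) * math.cos(d), math.cos(a), -math.sin(a) * math.sin(d)],
        [math.sin(d), 0.0, math.cos(d)],
    ]
    M = matvec(T, matvec(A, [1.0, 0.0, 0.0]))
    N = matvec(T, matvec(A, [0.0, KAPPA * pmra_cosdec, KAPPA * pmdec]))
    lam = M  # unit vector toward the star in Galactic coordinates
    return [0.0, 0.0, 0.0] + M, lam + N


def xyzuvw(omega, gamma, rv, dist):
    """Equation (6): Q = Omega * nu + Gamma * varpi (pc and km/s)."""
    return [o * rv + g * dist for o, g in zip(omega, gamma)]
```

For a star with zero proper motion, the position and velocity returned by `xyzuvw` are both aligned with the line of sight, with norms equal to the input distance and radial velocity.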

Equation (6) can be used to express ${{ \mathcal P }}_{q}$ as a function of the observables {O_i}, but solving the marginalization integrals of Equation (3) requires applying a change of coordinates (from the {Q_i} frame to the {O_i} frame) to the probability density function itself ${{ \mathcal P }}_{q}\to {{ \mathcal P }}_{o}$ . This step is detailed in Appendix A, and yields:

$\begin{eqnarray*}&&{{ \mathcal P }}_{o}(\{{O}_{i}\}| {H}_{k})={\varpi }^{4}\,{{ \mathcal P }}_{q}(\{{Q}_{i}\}| {H}_{k}).\end{eqnarray*}$

Inserting the coordinate transformation (Equation (6)) into the kinematic model defined in Equation (5) yields:

$\begin{eqnarray}{{ \mathcal M }}_{k}^{2}(\nu ,\varpi ) & = & \displaystyle \sum _{{ij}}[{\sigma }_{{ij},k}^{-1}{{\rm{\Omega }}}_{i}{{\rm{\Omega }}}_{j}\,{\nu }^{2}+{\sigma }_{{ij},k}^{-1}{{\rm{\Gamma }}}_{i}{{\rm{\Gamma }}}_{j}\,{\varpi }^{2}\\ & & +\,{\sigma }_{{ij},k}^{-1}({{\rm{\Omega }}}_{i}{{\rm{\Gamma }}}_{j}+{{\rm{\Gamma }}}_{i}{{\rm{\Omega }}}_{j})\nu \varpi \\ & & -\,{\sigma }_{{ij},k}^{-1}({{\rm{\Omega }}}_{i}{\tau }_{j,k}+{\tau }_{i,k}{{\rm{\Omega }}}_{j})\nu \\ & & -\,{\sigma }_{{ij},k}^{-1}({{\rm{\Gamma }}}_{i}{\tau }_{j,k}+{\tau }_{i,k}{{\rm{\Gamma }}}_{j})\varpi \\ & & +\,{\sigma }_{{ij},k}^{-1}{\tau }_{i,k}{\tau }_{j,k}].\end{eqnarray} \tag{ 8 }$

All terms in Ω_i, Γ_i and τ_i,k can be described as scalar products induced by the inverse covariance matrix ${\sigma }_{{ij},k}^{-1}$ , such as:

$\begin{eqnarray}&&{\langle \bar{X},\bar{Y}\rangle }_{k}=\displaystyle \sum _{{ij}}{\sigma }_{{ij},k}^{-1}{X}_{i}{Y}_{j},\end{eqnarray} \tag{ 9 }$

where the k index will be omitted in the remainder of this work for simplicity.

Equation (8) can be further simplified from the fact that the covariance matrix is symmetric by definition. With these two simplifications we get:

$\begin{eqnarray}{{ \mathcal M }}^{2}(\nu ,\varpi ) & = & \langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle {\nu }^{2}+\langle \bar{{\rm{\Gamma }}},\bar{{\rm{\Gamma }}}\rangle {\varpi }^{2}+2\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Gamma }}}\rangle \nu \varpi \\ & & -\,2\langle \bar{{\rm{\Omega }}},\bar{\tau }\rangle \nu -2\langle \bar{{\rm{\Gamma }}},\bar{\tau }\rangle \varpi +\langle \bar{\tau },\bar{\tau }\rangle .\end{eqnarray} \tag{ 10 }$
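The equivalence between Equation (5) evaluated at $\bar{Q}=\bar{{\rm{\Omega }}}\nu +\bar{{\rm{\Gamma }}}\varpi$ and the expanded form of Equation (10) can be checked numerically. The sketch below uses plain Python, hypothetical function names, and arbitrary illustrative inputs; the only requirement is that the inverse covariance matrix be symmetric.

```python
def inner(x, y, inv_cov):
    """Scalar product of Equation (9) induced by the inverse covariance."""
    return sum(inv_cov[i][j] * x[i] * y[j]
               for i in range(6) for j in range(6))


def m2_direct(omega, gamma_vec, tau, inv_cov, rv, dist):
    """Squared Mahalanobis distance of Equation (5) at Q = Omega*nu + Gamma*varpi."""
    diff = [o * rv + g * dist - t for o, g, t in zip(omega, gamma_vec, tau)]
    return inner(diff, diff, inv_cov)


def m2_expanded(omega, gamma_vec, tau, inv_cov, rv, dist):
    """Squared Mahalanobis distance written as in Equation (10)."""
    oo = inner(omega, omega, inv_cov)
    gg = inner(gamma_vec, gamma_vec, inv_cov)
    og = inner(omega, gamma_vec, inv_cov)
    ot = inner(omega, tau, inv_cov)
    gt = inner(gamma_vec, tau, inv_cov)
    tt = inner(tau, tau, inv_cov)
    return (oo * rv ** 2 + gg * dist ** 2 + 2.0 * og * rv * dist
            - 2.0 * ot * rv - 2.0 * gt * dist + tt)


# Arbitrary illustrative inputs; inv_cov is symmetric and positive definite.
inv_cov = [[1.0 if i == j else 0.1 for j in range(6)] for i in range(6)]
omega = [0.0, 0.0, 0.0, 0.6, 0.8, 0.0]
gamma_vec = [0.6, 0.8, 0.0, -0.04, 0.03, 0.01]
tau = [20.0, 30.0, 5.0, -10.0, -15.0, -8.0]
```

The six scalar products do not depend on ν or ϖ, so they need to be computed only once per star and hypothesis.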

This completes the coordinate change of the Bayesian likelihood:

$\begin{eqnarray}&&{{ \mathcal P }}_{o}(\{{O}_{i}\}| H)d\nu \,d\varpi =\displaystyle \frac{{\varpi }^{4}{e}^{-\tfrac{1}{2}{{ \mathcal M }}^{2}(\nu ,\varpi )}}{\sqrt{{(2\pi )}^{6}| \bar{\bar{{\rm{\Sigma }}}}| }}\,d\nu \,d\varpi .\end{eqnarray} \tag{ 11 }$

3.3. Solving the Marginalization Integrals

A complete analytical solution to Equation (3) is developed in Appendix B and yields:

$\begin{eqnarray}&&{ \mathcal P }(\{{O}_{i}\}| H)=\displaystyle \frac{{{ \mathcal D }}_{-5}^{{\prime} }(\gamma /\sqrt{2\beta }){e}^{{\gamma }^{2}/4\beta -\zeta }}{| \bar{{\rm{\Omega }}}| \sqrt{{\pi }^{5}{\beta }^{5}| \bar{\bar{{\rm{\Sigma }}}}| }},\end{eqnarray} \tag{ 12 }$

$\begin{eqnarray}&&{\rm{with}}\ \beta =\displaystyle \frac{\langle \bar{{\rm{\Gamma }}},\bar{{\rm{\Gamma }}}\rangle }{2}-\displaystyle \frac{1}{2}\displaystyle \frac{{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Gamma }}}\rangle }^{2}}{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle },\end{eqnarray} \tag{ 13 }$

$\begin{eqnarray}&&\gamma =\displaystyle \frac{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Gamma }}}\rangle \langle \bar{{\rm{\Omega }}},\bar{\tau }\rangle }{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle }-\langle \bar{{\rm{\Gamma }}},\bar{\tau }\rangle ,\end{eqnarray} \tag{ 14 }$

$\begin{eqnarray}&&\zeta =\displaystyle \frac{\langle \bar{\tau },\bar{\tau }\rangle }{2}-\displaystyle \frac{1}{2}\displaystyle \frac{{\langle \bar{{\rm{\Omega }}},\bar{\tau }\rangle }^{2}}{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle },\end{eqnarray} \tag{ 15 }$

$\begin{eqnarray*}{{ \mathcal D }}_{-5}^{{\prime} }(x) & = & \sqrt{\displaystyle \frac{\pi }{2}}({x}^{4}+6{x}^{2}+3)\mathrm{erfc}\left(\tfrac{x}{\sqrt{2}}\right)\\ & & -\,({x}^{3}+5x){e}^{-{x}^{2}/2},\end{eqnarray*}$

where ${{ \mathcal D }}_{-5}^{{\prime} }(x)$ is a parabolic cylinder function (Magnus & Oberhettinger 1948) modified for numerical stability.
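As a sanity check of the modified parabolic cylinder function, the following Python sketch compares a closed form built from ${{ \mathcal D }}_{-5}^{{\prime} }$ with a direct numerical integration. It assumes, from completing the square in ν in Equation (10), that the remaining distance integral has the form $\int_0^\infty \varpi^4 e^{-\beta\varpi^2-\gamma\varpi}\,d\varpi$; the function names and the midpoint-rule integrator are illustrative choices, not part of the BANYAN Σ code.

```python
import math


def parabolic_cylinder_mod(x):
    """Modified parabolic cylinder function D'_{-5}(x) as defined above."""
    return (math.sqrt(math.pi / 2.0) * (x ** 4 + 6.0 * x ** 2 + 3.0)
            * math.erfc(x / math.sqrt(2.0))
            - (x ** 3 + 5.0 * x) * math.exp(-0.5 * x ** 2))


def distance_integral_analytic(beta, gamma):
    """Closed form of the integral of w^4 exp(-beta w^2 - gamma w), w in [0, inf)."""
    x = gamma / math.sqrt(2.0 * beta)
    return (parabolic_cylinder_mod(x) * math.exp(gamma ** 2 / (4.0 * beta))
            / (2.0 * beta) ** 2.5)


def distance_integral_numeric(beta, gamma, upper=40.0, steps=40000):
    """Midpoint-rule estimate of the same integral, truncated at `upper`."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        w = (i + 0.5) * h
        total += w ** 4 * math.exp(-beta * w ** 2 - gamma * w)
    return total * h
```

For γ = 0 and β = 1/2 both functions reduce to the fourth moment of a half-Gaussian, 3√(π/2). For large positive arguments the two terms of ${{ \mathcal D }}_{-5}^{{\prime} }$ nearly cancel, so this naive implementation should be expected to lose precision in that regime.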

A limitation of this development is that correlations between the measurement errors of the sky coordinates, proper motion and parallax cannot be accounted for. Properly accounting for such correlations would require performing much more CPU-intensive and less precise numerical integrals. However, we demonstrate in Section 8 that ignoring such correlations causes negligible biases in the Bayesian membership probabilities.

In summary, obtaining the Bayesian probability for a given star and hypothesis H requires calculating the components of the vectors $\bar{{\rm{\Omega }}}$ , $\bar{{\rm{\Gamma }}}$ , $\bar{\tau }$ (Equations (6) and (7)) and their various scalar products (Equations (13)–(15)) using Equation (9), then evaluating the marginalized Bayesian likelihood with Equation (12), and finally evaluating the Bayesian membership probability with Equation (1), which makes use of Equation (2). The priors for each hypothesis are also needed to evaluate Equation (1); these are defined in Section 7. When a large number of stars is analyzed, it is possible to improve the efficiency in solving Equation (12) by calculating the six individual components of the 6D vectors, as well as β, γ, ζ and ${ \mathcal P }(\{{O}_{i}\}| H)$ for the full array of stars at once.
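The array-based evaluation mentioned above can be sketched with NumPy, where `numpy.einsum` evaluates the scalar product of Equation (9) for all stars in a single call. The function names and array layout (one row of six components per star) are assumptions made for this illustration and do not describe the public BANYAN Σ code.

```python
import numpy as np


def inner_many(x, y, inv_cov):
    """Equation (9) for N stars at once; x and y have shape (N, 6)."""
    return np.einsum("ni,ij,nj->n", x, inv_cov, y)


def beta_gamma_zeta(omega, gamma_vec, tau, inv_cov):
    """Equations (13)-(15) for N stars and a single hypothesis.

    omega and gamma_vec have shape (N, 6); tau has shape (6,).
    """
    tau_rows = np.broadcast_to(tau, omega.shape)
    oo = inner_many(omega, omega, inv_cov)
    gg = inner_many(gamma_vec, gamma_vec, inv_cov)
    og = inner_many(omega, gamma_vec, inv_cov)
    ot = inner_many(omega, tau_rows, inv_cov)
    gt = inner_many(gamma_vec, tau_rows, inv_cov)
    tt = inner_many(tau_rows, tau_rows, inv_cov)
    beta = 0.5 * gg - 0.5 * og ** 2 / oo
    gam = og * ot / oo - gt
    zeta = 0.5 * tt - 0.5 * ot ** 2 / oo
    return beta, gam, zeta
```

Each returned array has one entry per star, so the subsequent evaluation of Equation (12) can also be applied element-wise without an explicit loop.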

3.4. Optimal Radial Velocity and Distance

The optimal values for the radial velocity and distance (ν_o, ϖ_o) that maximize the non-marginalized Bayesian likelihood ${{ \mathcal P }}_{o}(\{{O}_{i}\}| H)$ can be determined for each hypothesis H. They correspond to predictions for the radial velocity and distance of a star, assuming that the star is a true member of H (e.g., Liu et al. 2016 obtain BANYAN II distance predictions with accuracies as low as ∼20% by ignoring this fact).

The optimal radial velocity and distance are derived in Appendix C:

$\begin{eqnarray*}{\varpi }_{{\rm{o}}} & = & \displaystyle \frac{-\gamma +\sqrt{{\gamma }^{2}+32\beta }}{4\beta },\\ {\nu }_{{\rm{o}}} & = & \displaystyle \frac{4+\langle \bar{{\rm{\Gamma }}},\bar{\tau }\rangle {\varpi }_{{\rm{o}}}-\langle \bar{{\rm{\Gamma }}},\bar{{\rm{\Gamma }}}\rangle {\varpi }_{{\rm{o}}}^{2}}{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Gamma }}}\rangle {\varpi }_{{\rm{o}}}},\\ {\sigma }_{\varpi } & = & | \bar{{\rm{\Gamma }}}{| }^{-1},\\ {\sigma }_{\nu } & = & | \bar{{\rm{\Omega }}}{| }^{-1},\end{eqnarray*}$

where σ_ϖ and σ_ν represent statistical 1σ error bars on the optimal values. These two values are given for each hypothesis and each star as an output of BANYAN Σ, and can be taken as predictions on the radial velocity and distance measurements that would maximize the probability of a given hypothesis.

Because the optimal radial velocity and distance are intrinsically linked to the assumption that a star is a member of hypothesis H, adopting the values (ν_o, ϖ_o) for a low-probability hypothesis H carries a large risk that the true radial velocity and distance of the star are several σ away from the prediction. The values of σ_ϖ and σ_ν will therefore be unreliable in such a situation.
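The optimal values of this section can be sketched in plain Python. The function names are hypothetical, and the norms $| \bar{{\rm{\Gamma }}}|$ and $| \bar{{\rm{\Omega }}}|$ are assumed here to be those induced by the scalar product of Equation (9). The sketch also returns an alternative expression for ν_o, obtained by setting the derivative of Equation (10) with respect to ν to zero; it is algebraically equivalent at ϖ_o but remains finite when $\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Gamma }}}\rangle$ vanishes.

```python
import math


def inner(x, y, inv_cov):
    """Scalar product of Equation (9) induced by the inverse covariance."""
    return sum(inv_cov[i][j] * x[i] * y[j]
               for i in range(6) for j in range(6))


def log_integrand(omega, gamma_vec, tau, inv_cov, rv, dist):
    """Logarithm of Equation (11), up to an additive constant."""
    diff = [o * rv + g * dist - t for o, g, t in zip(omega, gamma_vec, tau)]
    return 4.0 * math.log(dist) - 0.5 * inner(diff, diff, inv_cov)


def optimal_rv_distance(omega, gamma_vec, tau, inv_cov):
    """Optimal radial velocity and distance with their 1-sigma widths."""
    oo = inner(omega, omega, inv_cov)
    gg = inner(gamma_vec, gamma_vec, inv_cov)
    og = inner(omega, gamma_vec, inv_cov)
    ot = inner(omega, tau, inv_cov)
    gt = inner(gamma_vec, tau, inv_cov)
    beta = 0.5 * gg - 0.5 * og ** 2 / oo
    gam = og * ot / oo - gt
    dist = (-gam + math.sqrt(gam ** 2 + 32.0 * beta)) / (4.0 * beta)
    # Expression given in the text; undefined when <Omega, Gamma> = 0.
    rv = (4.0 + gt * dist - gg * dist ** 2) / (og * dist)
    # Assumed equivalent form from d(M^2)/d(nu) = 0 in Equation (10).
    rv_alt = (ot - og * dist) / oo
    return {"rv": rv, "rv_alt": rv_alt, "dist": dist,
            "sigma_rv": 1.0 / math.sqrt(oo),
            "sigma_dist": 1.0 / math.sqrt(gg)}


# Arbitrary illustrative inputs (diagonal inverse covariance matrix).
variances = [100.0, 100.0, 100.0, 4.0, 1.0, 1.0]
inv_cov = [[1.0 / variances[i] if i == j else 0.0 for j in range(6)]
           for i in range(6)]
omega = [0.0, 0.0, 0.0, 0.6, 0.8, 0.0]
gamma_vec = [0.6, 0.8, 0.0, -0.04, 0.03, 0.01]
tau = [20.0, 30.0, 5.0, -10.0, -15.0, -8.0]
best = optimal_rv_distance(omega, gamma_vec, tau, inv_cov)
```

Evaluating `log_integrand` at small offsets around (`best["rv"]`, `best["dist"]`) gives a direct check that the returned pair is a maximum of the non-marginalized likelihood.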

3.5. Approximating the Effect of Proper Motion Measurement Errors

As discussed in Section 2, including the effect of proper motion errors requires solving two additional marginalization integrals on the proper motion components, in addition to those on radial velocity and distance. Since obtaining an analytical solution to these four marginalization integrals is impractical, this section describes an analytical approximation for the effect of the proper motion measurement error.

The optimal radial velocity ν_o and distance ϖ_o that maximize the non-marginalized Bayesian likelihood ${{ \mathcal P }}_{o}(\{{O}_{i}\}| H)$ can be used to propagate the proper motion measurement errors to the Galactic position XYZ and space velocity UVW in the vicinity of the maximum of the membership probability distribution function. The sky position, proper motion, ν_o, ϖ_o and proper motion errors are propagated in XYZUVW space to obtain a Galactic error vector ${\bar{\sigma }}_{q}=({\sigma }_{X},{\sigma }_{Y},{\sigma }_{Z},{\sigma }_{U},{\sigma }_{V},{\sigma }_{W})$ .

It is then possible to add these errors in quadrature to the diagonal elements of $\bar{\bar{{\rm{\Sigma }}}}$ without affecting the orientation of the multivariate Gaussian with:

$\begin{eqnarray*}{\bar{\bar{{\rm{\Sigma }}}}}^{{\prime} } & = & \bar{\bar{G}}\,\bar{\bar{{\rm{\Sigma }}}}\,\bar{\bar{G}},\\ \bar{\bar{G}} & = & {\mathrm{diag}}_{+}\sqrt{\displaystyle \frac{{\mathrm{diag}}_{-}(\bar{\bar{{\rm{\Sigma }}}})+{\bar{\sigma }}_{q}^{2}}{{\mathrm{diag}}_{-}(\bar{\bar{{\rm{\Sigma }}}})}},\end{eqnarray*}$

where ${\mathrm{diag}}_{-}\,(\bar{\bar{M}})$ extracts the diagonal elements of matrix $\bar{\bar{M}}$ and ${\mathrm{diag}}_{+}\,(\bar{v})$ builds a diagonal matrix with vector $\bar{v}$ .

The need for a time-consuming matrix inversion of ${\bar{\bar{{\rm{\Sigma }}}}}^{{\prime} }$ for each star to evaluate Equation (10) can be avoided with:

$\begin{eqnarray*}{\bar{\bar{{\rm{\Sigma }}}}}^{{\prime} -1} & = & {\bar{\bar{G}}}^{-1}\,{\bar{\bar{{\rm{\Sigma }}}}}^{-1}\,{\bar{\bar{G}}}^{-1},\\ {\bar{\bar{G}}}^{-1} & = & {\mathrm{diag}}_{+}\sqrt{\displaystyle \frac{{\mathrm{diag}}_{-}\,(\bar{\bar{{\rm{\Sigma }}}})}{{\mathrm{diag}}_{-}\,(\bar{\bar{{\rm{\Sigma }}}})+{\bar{\sigma }}_{q}^{2}}}.\end{eqnarray*}$
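The inflation of the covariance matrix and the shortcut for its inverse can be sketched with NumPy. The function name `inflate_covariance` is hypothetical; because $\bar{\bar{G}}$ is diagonal, both matrix products reduce to element-wise scalings by the outer product of its diagonal.

```python
import numpy as np


def inflate_covariance(sigma, sigma_inv, err_q):
    """Return (Sigma', Sigma'^-1) after adding err_q in quadrature.

    sigma and sigma_inv are the 6x6 covariance matrix and its inverse;
    err_q holds the six propagated errors (sigma_X, ..., sigma_W).
    """
    var = np.diag(sigma)
    g = np.sqrt((var + np.asarray(err_q, dtype=float) ** 2) / var)  # diagonal of G
    scale = np.outer(g, g)
    sigma_infl = sigma * scale          # G Sigma G
    sigma_infl_inv = sigma_inv / scale  # G^-1 Sigma^-1 G^-1
    return sigma_infl, sigma_infl_inv
```

The diagonal of the returned matrix equals the original variances plus the squared errors, while the correlation coefficients are unchanged. The inverse of the original matrix needs to be computed only once per hypothesis.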

Including an approximated effect of the proper motion measurement errors will thus require a calculation of (1) the $\bar{{\rm{\Omega }}}$ , $\bar{{\rm{\Gamma }}}$ and $\bar{\tau }$ vectors and their scalar products, (2) ν_o and ϖ_o, (3) the ${\bar{\sigma }}_{q}$ vector and the corresponding inflated matrix ${\bar{\bar{{\rm{\Sigma }}}}}^{{\prime} }$ , and (4) all quantities from the beginning of Section 3.2 obtained with the updated scalar product based on ${\bar{\bar{{\rm{\Sigma }}}}}^{{\prime} -1}$ . The approximation described here does not make the assumption that a star is a member of any young association or the field. Instead, the proper motion errors are propagated to XYZUVW independently for each Bayesian hypothesis, ensuring that it is valid near the peak of the probability distribution of each hypothesis. In other words, the steps described above are carried out independently when calculating the membership probability of each hypothesis.

3.6. Parallax Motion

When the proper motion of a nearby star is measured based on two epochs only, the measurement may be contaminated in part by parallax motion. This is true because the measured displacement of a star between two epochs combines its true proper motion (i.e., its space velocity projected on the celestial sphere) with its displacement along its parallactic ellipse, the latter of which is caused by a change in the observer's point of view as the Earth progresses on its orbit around the Sun. These two effects can however be decoupled in the BANYAN Σ formalism, as long as the parallax factors (ψ_α, ψ_δ), described below, are measured for the star. The parallax factors physically represent the motion of the star purely due to the Earth's motion between the two epochs if the star were placed at exactly 1 pc from the Sun.

The parallax motion $({{\rm{\Delta }}}_{\alpha \pi },{{\rm{\Delta }}}_{\delta \pi })$ of a star is given by Smart & Green (1977):

$\begin{eqnarray}&&{{\rm{\Delta }}}_{\alpha \pi }\cos \delta =\displaystyle \frac{\cos \alpha \cos e\sin {{\ell }}_{s}-\sin \alpha \cos {{\ell }}_{s}}{\varpi },\end{eqnarray} \tag{ 16 }$

$\begin{eqnarray}{{\rm{\Delta }}}_{\delta \pi } & = & \displaystyle \frac{\cos \delta \sin e\sin {{\ell }}_{s}-\cos \alpha \sin \delta \cos {{\ell }}_{s}}{\varpi }\\ & & -\,\displaystyle \frac{\sin \alpha \sin \delta \cos e\sin {{\ell }}_{s}}{\varpi },\end{eqnarray} \tag{ 17 }$

where e is the obliquity of the ecliptic of the Earth's orbit and ℓ_s is the ecliptic longitude of the Sun at a given epoch.¹⁴ These equations can be simplified by grouping all epoch-dependent terms into (ϕ_α, ϕ_δ):

$\begin{eqnarray}&&{{\rm{\Delta }}}_{\alpha \pi }(\alpha ,t)\cos \delta =\displaystyle \frac{{\phi }_{\alpha }(\alpha ,t)}{\varpi },\end{eqnarray} \tag{ 18 }$

$\begin{eqnarray}&&{{\rm{\Delta }}}_{\delta \pi }(\alpha ,\delta ,t)=\displaystyle \frac{{\phi }_{\delta }(\alpha ,\delta ,t)}{\varpi },\end{eqnarray} \tag{ 19 }$

where t is the epoch.

The apparent motion $({\mu }_{\alpha }^{{\prime} },{\mu }_{\delta }^{{\prime} })$ of a star between epochs t₁ and t₂ will thus be given by:

$\begin{eqnarray*}{\mu }_{\delta }^{{\prime} } & = & \displaystyle \frac{(\delta ({t}_{2})+{{\rm{\Delta }}}_{\delta \pi }({t}_{2}))-(\delta ({t}_{1})+{{\rm{\Delta }}}_{\delta \pi }({t}_{1}))}{{t}_{2}-{t}_{1}}\\ & = & {\mu }_{\delta }+\displaystyle \frac{1}{\varpi }\displaystyle \frac{{\phi }_{\delta }({t}_{2})-{\phi }_{\delta }({t}_{1})}{{t}_{2}-{t}_{1}}\\ & = & {\mu }_{\delta }+\displaystyle \frac{{\psi }_{\delta }}{\varpi },\mathrm{and}\,\mathrm{similarly}:\\ {\mu }_{\alpha }^{{\prime} }\cos \delta & = & {\mu }_{\alpha }\cos \delta +\displaystyle \frac{{\psi }_{\alpha }}{\varpi }.\end{eqnarray*}$

Since Equation (6) is linear in proper motion components, it can be expressed as a function of apparent motion with the form:

$\begin{eqnarray}\bar{Q} & = & \bar{{\rm{\Omega }}}\nu +{\bar{{\rm{\Gamma }}}}^{{\prime} }\varpi -\bar{{\rm{\Phi }}},\\ \bar{{\rm{\Phi }}} & = & (0,{\boldsymbol{ \mathcal T }}{\boldsymbol{ \mathcal A }}\,{\boldsymbol{\psi }}),\\ {\boldsymbol{\psi }} & = & \kappa (0,{\psi }_{\alpha },{\psi }_{\delta }),\\ {\psi }_{\alpha } & = & \displaystyle \frac{\varpi \cos \delta ({{\rm{\Delta }}}_{\alpha \pi }({t}_{2})-{{\rm{\Delta }}}_{\alpha \pi }({t}_{1}))}{{t}_{2}-{t}_{1}},\\ {\psi }_{\delta } & = & \displaystyle \frac{\varpi ({{\rm{\Delta }}}_{\delta \pi }({t}_{2})-{{\rm{\Delta }}}_{\delta \pi }({t}_{1}))}{{t}_{2}-{t}_{1}},\end{eqnarray} \tag{ 20 }$

where ${\bar{{\rm{\Gamma }}}}^{{\prime} }$ is a function of apparent motion $({\mu }_{\alpha }^{{\prime} }\cos \delta ,{\mu }_{\delta }^{{\prime} })$ , the quantity that is directly measured, instead of true proper motion $({\mu }_{\alpha }\cos \delta ,{\mu }_{\delta })$ .

Because $\bar{Q}$ only appears in the Bayesian likelihood as relative to the center of the moving group model τ, the effect of parallax motion can be fully accounted for by shifting the UVW center of the young association kinematic model by $+{\boldsymbol{ \mathcal T }}{\boldsymbol{ \mathcal A }}\,{\boldsymbol{\psi }}$ , which is equivalent to shifting $\bar{\tau }$ by $+\bar{{\rm{\Phi }}}$ :

$\begin{eqnarray*}&&{\bar{\tau }}^{{\prime} }=\bar{\tau }+\bar{{\rm{\Phi }}}.\end{eqnarray*}$

As a consequence, the parallax motion can be accounted for by using the measured apparent motion as if it were a true proper motion, that is, by replacing $\bar{{\rm{\Gamma }}}\to {\bar{{\rm{\Gamma }}}}^{{\prime} }$ and $\bar{\tau }\to {\bar{\tau }}^{{\prime} }$ in the BANYAN Σ formalism. This requires measurements of the parallax factors (ψ_α, ψ_δ) for each star in addition to measurements of their apparent motion. In practice, this correction is applied by the BANYAN Σ software only when the keyword use_psi is explicitly used. This indicates that: (1) the proper motion that is input to BANYAN Σ was measured from two epochs only, (2) the effect of parallax motion was not corrected in the proper motion measurement and therefore it really is a measurement of apparent motion, and (3) the parallax factors are provided to BANYAN Σ using the same two epochs as those between which the proper motion was measured. If any of the above statements is not true, the parallax motion correction described above should not be used.
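The parallax factors of Equation (20) follow directly from Equations (16)–(19). A minimal Python sketch is given below under stated assumptions: the function names are hypothetical, all angles are in radians, the solar ecliptic longitudes at both epochs are supplied by the user, and the default obliquity is the approximate J2000 value. No unit conversion (such as the factor κ) is applied, so the outputs are in the units of ϕ per unit of t₂ − t₁.

```python
import numpy as np

def phi(alpha, delta, lsun, obliquity):
    """Epoch-dependent terms (phi_alpha, phi_delta) of Equations (16)-(19)."""
    phi_a = (np.cos(alpha) * np.cos(obliquity) * np.sin(lsun)
             - np.sin(alpha) * np.cos(lsun))
    phi_d = (np.cos(delta) * np.sin(obliquity) * np.sin(lsun)
             - np.cos(alpha) * np.sin(delta) * np.cos(lsun)
             - np.sin(alpha) * np.sin(delta) * np.cos(obliquity) * np.sin(lsun))
    return phi_a, phi_d

def parallax_factors(alpha, delta, lsun1, lsun2, t1, t2,
                     obliquity=np.radians(23.4393)):
    """Parallax factors (psi_alpha, psi_delta) between epochs t1 and t2.

    lsun1, lsun2 : ecliptic longitude of the Sun at t1 and t2 (radians)
    obliquity    : assumed approximate J2000 obliquity of the ecliptic
    """
    a1, d1 = phi(alpha, delta, lsun1, obliquity)
    a2, d2 = phi(alpha, delta, lsun2, obliquity)
    dt = t2 - t1
    return (a2 - a1) / dt, (d2 - d1) / dt
```

When the two epochs correspond to the same solar longitude (e.g., observations taken exactly one year apart), both factors vanish and the apparent motion equals the true proper motion.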

3.7. Additional Kinematic Observables

Only sky position and proper motion are required for BANYAN Σ to compute membership probabilities. However, radial velocities and/or distances can also be included as input measurements to obtain more accurate membership probabilities. This is similar to the functioning of BANYAN I (Malo et al. 2013) and BANYAN II (Gagné et al. 2014). In cases where a radial velocity measurement is available, Equation (3) can be rewritten as:

$\begin{eqnarray*}&&{ \mathcal P }(\{{O}_{i}\}| H)={\int }_{-\infty }^{\infty }{{ \mathcal P }}_{m}(\nu ){\int }_{0}^{\infty }\displaystyle \frac{{\varpi }^{4}{e}^{-\tfrac{1}{2}{{ \mathcal M }}^{2}}}{\sqrt{{(2\pi )}^{6}| {\bar{\bar{{\rm{\Sigma }}}}}^{{\prime} }| }}\,d\varpi d\nu ,\end{eqnarray*}$

where ${{ \mathcal P }}_{m}(\nu )$ is the probability density function that represents the radial velocity measurement.

Assuming that ${{ \mathcal P }}_{m}(\nu )$ is a Gaussian distribution centered on ν_m with a characteristic width of ${\sigma }_{\nu ,m}$ , the equation above can be solved with a similar method to that described in Appendix B, where the following scalar products are modified with:

$\begin{eqnarray*}\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle & \to & \langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle +{({\sigma }_{\nu ,m})}^{-2},\\ \langle \bar{{\rm{\Omega }}},\bar{\tau }\rangle & \to & \langle \bar{{\rm{\Omega }}},\bar{\tau }\rangle +{\nu }_{m}{({\sigma }_{\nu ,m})}^{-2},\\ \langle \bar{\tau },\bar{\tau }\rangle & \to & \langle \bar{\tau },\bar{\tau }\rangle +{\nu }_{m}^{2}{({\sigma }_{\nu ,m})}^{-2}.\end{eqnarray*}$

The case where a distance measurement is available can be solved in a similar way, by replacing:

$\begin{eqnarray*}\langle \bar{{\rm{\Gamma }}},\bar{{\rm{\Gamma }}}\rangle & \to & \langle \bar{{\rm{\Gamma }}},\bar{{\rm{\Gamma }}}\rangle +{({\sigma }_{\varpi ,m})}^{-2},\\ \langle \bar{{\rm{\Gamma }}},\bar{\tau }\rangle & \to & \langle \bar{{\rm{\Gamma }}},\bar{\tau }\rangle +{\varpi }_{m}{({\sigma }_{\varpi ,m})}^{-2},\\ \langle \bar{\tau },\bar{\tau }\rangle & \to & \langle \bar{\tau },\bar{\tau }\rangle +{\varpi }_{m}^{2}{({\sigma }_{\varpi ,m})}^{-2}.\end{eqnarray*}$

The case where both radial velocity and distance measurements are available can be solved by combining all of the variable changes described above (the two changes on $\langle \bar{\tau },\bar{\tau }\rangle$ must be added together).
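The substitutions above amount to a few additions on precomputed scalar products. The following Python sketch illustrates this bookkeeping; the function name and the dictionary keys are hypothetical and do not correspond to the BANYAN Σ code.

```python
def include_measurements(products, rv=None, rv_err=None,
                         dist=None, dist_err=None):
    """Return a copy of the scalar products updated with optional
    radial velocity (rv) and distance (dist) measurements.

    products : dict with the (assumed) keys 'omega_omega', 'omega_tau',
               'gamma_gamma', 'gamma_tau' and 'tau_tau'
    """
    out = dict(products)
    if rv is not None:
        w = rv_err ** -2                 # inverse variance of the RV
        out["omega_omega"] += w
        out["omega_tau"] += rv * w
        out["tau_tau"] += rv**2 * w
    if dist is not None:
        w = dist_err ** -2               # inverse variance of the distance
        out["gamma_gamma"] += w
        out["gamma_tau"] += dist * w
        out["tau_tau"] += dist**2 * w    # adds to any RV term applied above
    return out
```

When both measurements are supplied, the two contributions to the `tau_tau` entry are summed, as required.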

3.8. Photometric Observables

It is possible to constrain the distance of a star from its position in a color–magnitude or spectral type–magnitude diagram by comparing its absolute magnitude to a sequence of field stars, or to members of a young association, at a fixed color or spectral type. The position of a sequence in most of these diagrams is dependent on the age of its population, which translates to a different distance constraint for each Bayesian hypothesis.

Such photometric constraints can be included in the BANYAN Σ framework, in a similar way to the method described in Section 3.7 for distance measurements, except that different values of the most likely distance ϖ_m and its uncertainty ${\sigma }_{\varpi ,m}$ must be used, one for each hypothesis, because they derive from different color–magnitude sequences. This is a consequence of the fact that the different Bayesian hypotheses correspond to populations of stars at different ages.

In the absence of a trigonometric distance measurement, users can create custom color–magnitude diagrams and determine values of $({\varpi }_{m},{\sigma }_{\varpi ,m})$ for the field and each young association, and provide them to BANYAN Σ for a full inclusion of these constraints in the Bayesian probabilities. Multiple color–magnitude diagrams can also be combined into single measurements of $({\varpi }_{m},{\sigma }_{\varpi ,m})$ for a given star and young association, but failing to account for covariances between different photometric bands in a given stellar population would result in artificially small values of ${\sigma }_{\varpi ,m}$ . Such unrealistically precise constraints on the distance of a star would hinder the ability of BANYAN Σ to correctly identify the candidate members of a young association. Both the IDL and Python implementations of BANYAN Σ can accept these photometric distance constraints through the constraint_dist_per_hyp and constraint_edist_per_hyp keywords, which are detailed in the documentation of the code. In summary, distinct color–magnitude diagrams for each hypothesis can be included by providing BANYAN Σ with distinct photometric constraints on the distance of a star.
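One simple way to obtain $({\varpi }_{m},{\sigma }_{\varpi ,m})$ for a given hypothesis is the standard distance modulus relation, with the scatter of the sequence propagated to first order. The Python sketch below is illustrative only: the function name is hypothetical, a Gaussian scatter in absolute magnitude is assumed, and the magnitudes in the usage example are invented rather than taken from any published sequence.

```python
import math

def photometric_distance(app_mag, abs_mag_seq, abs_mag_scatter):
    """Distance (pc) and first-order uncertainty from the distance modulus.

    app_mag         : apparent magnitude of the star
    abs_mag_seq     : absolute magnitude of the sequence at the star's color
    abs_mag_scatter : 1-sigma scatter of the sequence (mag)
    """
    dist = 10 ** ((app_mag - abs_mag_seq + 5) / 5)
    dist_err = dist * math.log(10) / 5 * abs_mag_scatter
    return dist, dist_err

# Illustrative (made-up) sequence values, one per Bayesian hypothesis.
sequences = {"FIELD": (9.0, 0.4), "TWA": (7.5, 0.5)}
constraints = {name: photometric_distance(12.0, m_abs, scatter)
               for name, (m_abs, scatter) in sequences.items()}
```

How these per-hypothesis values must be packaged for the constraint_dist_per_hyp and constraint_edist_per_hyp keywords is described in the documentation of the code and is not reproduced here.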

In the cases where both parallax and photometric measurements are available, they can be included in a more straightforward way to BANYAN Σ through the Bayesian priors: the vertical distances in a color–magnitude diagram between the measured absolute magnitude and the sequence of field or young objects at a fixed color can be transformed to a probability for each association using Bayes' theorem, and the natural logarithm of these photometric probabilities can be included in BANYAN Σ with the ln_priors keyword available in both the IDL and Python implementations of the code.

Other age-dating observables, such as X-ray, UV, Hα, rotation and lithium abundance measurements can similarly be translated to a membership probability at the age of each young association (e.g., by comparing measurements with the X-ray luminosity distributions of Malo et al. 2014), and can also be included in the BANYAN Σ prior probabilities. This framework allows users to add observables in the BANYAN Σ membership determination without needing to change its algorithm, and remains accurate as long as no kinematic measurements are used to assign prior probabilities. The inclusion of such age indicators must rely on a user-specified method to translate each measurement into a probability that a given star is a member of each hypothesis, given the age of each young association. A compounded probability that each star is a member of each young association must then be calculated, based on only these youth indicators and no kinematics (e.g., by multiplying together the probabilities obtained from independent age indicators). The natural logarithm of these probabilities must then be input to BANYAN Σ with the keyword ln_priors. These data are passed to BANYAN Σ using a Python dictionary or an IDL structure depending on which version of the code is used, and we refer the reader to the respective documentation, which is provided as additional material to this manuscript, for more detail.
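The compounding step described above can be sketched in a few lines of Python. This is a minimal illustration assuming independent indicators; the function name is hypothetical, the probabilities in a real application come from a user-specified method, and the exact dictionary layout expected by the ln_priors keyword should be taken from the documentation of the code.

```python
import math

def ln_priors_from_indicators(indicator_probs):
    """Combine independent youth indicators into log-probabilities.

    indicator_probs : list of dicts, one per indicator, each mapping a
                      hypothesis name to a membership probability (> 0)
    Returns a dict mapping each hypothesis name to the natural logarithm
    of the product of its probabilities.
    """
    names = indicator_probs[0].keys()
    return {name: sum(math.log(probs[name]) for probs in indicator_probs)
            for name in names}
```

Summing logarithms rather than multiplying probabilities avoids numerical underflow when many indicators are combined.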

No color–magnitude sequences are provided here with the first version of BANYAN Σ, but they will be provided in future work as they are developed to target specific types of members.

3.9. Ignoring the Galactic Position XYZ

It is possible that the full spatial extent of some young associations has not yet been completely explored, especially at larger distances not covered by the Hipparcos survey. This possibility has been hypothesized by Bowler et al. (2017) among others, and the fact that the BANYAN tools rely on XYZ as well as UVW prevents an exploration of that potentially missing population of members.

The BANYAN Σ framework can be adapted to rely on UVW only, by artificially setting the first three diagonal elements of the covariance matrices $\bar{\bar{{\rm{\Sigma }}}}$ to a large value (e.g., 10⁹ pc²), and setting all other elements that contain at least one spatial component to zero. This approach is equivalent to using very large spatial widths for all models of young associations and the field in the BANYAN Σ formalism, and the general solution presented in Equation (12) remains unchanged. An option is provided in BANYAN Σ to ignore the spatial XYZ coordinates, and can be used to locate young association members that are spatially distant from the locus of known members. However, the rate of contamination from field stars is ∼100 times larger when using this option (see Section 8), and we therefore recommend extreme caution when using it.
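The modification of the covariance matrices described above can be written as a short NumPy sketch. The function name is hypothetical, and the ordering of the matrix axes as (X, Y, Z, U, V, W) is an assumption of this example.

```python
import numpy as np

def ignore_xyz(sigma, large_variance=1e9):
    """Return a copy of a 6x6 XYZUVW covariance matrix in which the
    spatial part carries no information.

    The first three diagonal elements are set to large_variance (pc^2)
    and every other element involving a spatial component is set to zero.
    The UVW block is left untouched.
    """
    out = np.array(sigma, dtype=float, copy=True)
    out[:3, :] = 0.0          # rows involving X, Y or Z
    out[:, :3] = 0.0          # columns involving X, Y or Z
    idx = np.arange(3)
    out[idx, idx] = large_variance
    return out
```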

4. Bona Fide Members of Young Associations within 150 pc

In this section, a list of bona fide members of young associations within 150 pc is compiled. This list will constitute the training set for the Gaussian models used in BANYAN Σ (see Sections 3.1 and 5 for more detail on the models). Each association considered in this work is listed in Table 1 with its age estimate and the total number of bona fide members that were compiled. In the literature, stars are typically considered bona fide members when they benefit from signs of youth and full kinematic measurements that allow them to be placed in XYZUVW space (Malo et al. 2013; Gagné et al. 2014). The indicators of youth used in the literature depend on the spectral type of the stars, and include isochronal age determinations through color–magnitude positions, lithium measurements, UV or X-ray luminosity, Hα emission, and rotational velocity; see Soderblom et al. (2014) for a review of these age-dating methods. Here we require the same measurements for bona fide members of the nearest or most well-studied young associations. Nine of the 27 associations, all located farther away than ∼90 pc, do not have enough members with full 6D kinematics for such measurements to be required in the construction of their models. In these cases, an average radial velocity or parallax (or both) for the association is used instead of individual measurements.

Table 1. General Characteristics and Bayesian Priors of Young Associations

Asso.	N_k^a	$\mathrm{ln}{\alpha }_{k}$ ^b				$\langle \varpi \rangle$ ^c	$\langle \nu \rangle$ ^d	S_spa^e	S_kin^f	Age	Age
		μ	μ, ν	μ, ϖ	μ, ν, ϖ	(pc)	(km s⁻¹)	(pc)	(km s⁻¹)	(Myr)	Ref.
118TAU	10	−17.22	−18.60	−21.37	−22.66	100 ± 10	14 ± 2	3.4	2.1	∼10	1
ABDMG	48	−14.11	−15.39	−16.56	−17.60	${30}_{-10}^{+20}$	${10}_{-20}^{+10}$	19.0	1.4	${149}_{-19}^{+51}$	2
βPMG	42	−13.57	−14.77	−17.39	−18.24	${30}_{-10}^{+20}$	10 ± 10	14.8	1.4	24 ± 3	2
CAR	7	−13.41	−14.82	−18.45	−19.15	60 ± 20	20 ± 2	11.8	0.8	${45}_{-7}^{+11}$	2
CARN	13	−15.51	−16.85	−17.64	−18.55	30 ± 20	${15}_{-10}^{+7}$	14.0	2.1	∼200	3
CBER	40	−13.70	−15.09	−22.32	−23.43	${85}_{-5}^{+4}$	−0.1 ± 0.8	3.6	0.5	${562}_{-84}^{+98}$	4
COL	23	−13.08	−14.10	−17.74	−18.34	50 ± 20	${21}_{-8}^{+3}$	15.8	0.9	${42}_{-4}^{+6}$	2
CRA	12	−17.55	−19.07	−21.89	−22.89	139 ± 4	−1 ± 1	1.5	1.7	4–5	5
EPSC	25	−17.47	−18.59	−22.38	−22.79	102 ± 4	14 ± 3	2.8	1.8	${3.7}_{-1.4}^{+4.6}$	6
ETAC	16	−20.19	−21.36	−25.75	−26.22	95 ± 1	20 ± 3	0.6	2.0	11 ± 3	2
HYA	177	−20.02	−21.54	−22.14	−23.57	42 ± 7	${39}_{-4}^{+3}$	4.5	1.2	750 ± 100	7
IC2391	16	−18.05	−18.90	−21.55	−21.55	149 ± 6	15 ± 3	2.2	1.4	50 ± 5	8
IC2602	17	−15.33	−16.60	−22.50	−22.60	146 ± 5	17 ± 3	1.8	1.1	${46}_{-5}^{+6}$	9
LCC	82	−13.13	−14.27	−17.76	−17.77	110 ± 10	14 ± 5	11.6	2.2	15 ± 3	10
OCT	14	−11.74	−11.56	−13.85	−11.29	${130}_{-20}^{+30}$	${8}_{-9}^{+8}$	22.4	1.3	35 ± 5	11
PL8	11	−13.28	−14.54	−19.37	−19.75	130 ± 10	22 ± 2	5.0	1.1	∼60	12
PLE	190	−18.72	−20.06	−20.61	−21.46	134 ± 9	6 ± 2	4.1	1.4	112 ± 5	13
ROPH	186	−17.49	−19.04	−24.10	−25.55	131 ± 1	−6.3 ± 0.2	0.7	1.6	< 2	14
TAU	122	−10.39	−11.37	−17.04	−17.99	120 ± 10	16 ± 3	10.7	3.6	1–2	15
THA	39	−16.48	−17.78	−19.58	−20.25	${46}_{-6}^{+8}$	${9}_{-6}^{+5}$	9.1	0.8	45 ± 4	2
THOR	35	−14.05	−15.57	−20.22	−21.12	96 ± 2	19 ± 3	3.9	2.1	${22}_{-3}^{+4}$	2
TWA	23	−16.99	−18.22	−20.42	−20.93	60 ± 10	10 ± 3	6.6	1.5	10 ± 3	2
UCL	103	−11.70	−13.10	−15.91	−16.19	130 ± 20	5 ± 5	17.4	2.5	16 ± 2	10
UCRA	10	−15.85	−16.87	−20.24	−20.48	147 ± 7	−1 ± 3	4.5	1.8	∼10	16
UMA	9	−23.14	−24.01	−26.44	−27.13	${25.4}_{-0.7}^{+0.8}$	−12 ± 3	1.2	1.3	414 ± 23	17
USCO	84	−12.77	−13.71	−17.62	−17.96	130 ± 20	−5 ± 4	9.9	2.8	10 ± 3	10
XFOR	11	−19.37	−20.80	−23.43	−23.72	100 ± 6	19 ± 2	2.6	1.3	∼500	18

Notes. See Sections 4 and 5 for more detail. The full names of young associations are: 118 Tau (118TAU), AB Doradus (ABDMG), β Pictoris (βPMG), Carina (CAR), Carina-Near (CARN), Coma Berenices (CBER), Columba (COL), Corona Australis (CRA), ε Chamaeleontis (EPSC), η Chamaeleontis (ETAC), the Hyades cluster (HYA), Lower Centaurus Crux (LCC), Octans (OCT), Platais 8 (PL8), the Pleiades cluster (PLE), ρ Ophiuchi (ROPH), the Tucana-Horologium association (THA), 32 Orionis (THOR), TW Hya (TWA), Upper Centaurus Lupus (UCL), Upper CrA (UCRA), the core of the Ursa Major cluster (UMA), Upper Scorpius (USCO), Taurus (TAU) and χ¹ For (XFOR).

^aNumber of bona fide members included in the kinematic model. ^bBayesian prior ensuring a recovery rate of ≈50%–90% for a threshold P = 90%, depending on input observables. ^cPeak of distance distribution and ±1σ range. ^dPeak of radial velocity distribution and ±1σ range. ^eCharacteristic spatial scale in XYZ space. ^fCharacteristic kinematic scale in UVW space.

References. (1) Mamajek (2016), (2) Bell et al. (2015), (3) Zuckerman et al. (2006), (4) Silaj & Landstreet (2014), (5) Gennaro et al. (2012), (6) Murphy et al. (2013), (7) Brandt & Huang (2015), (8) Barrado y Navascués et al. (2004), (9) Dobbie et al. (2010), (10) Pecaut & Mamajek (2016), (11) Murphy & Lawson (2015), (12) Platais et al. (1998), (13) Dahm (2015), (14) Wilking et al. (2008), (15) Kenyon & Hartmann (1995), (16) this paper, (17) Jones et al. (2015b), (18) Pöhnl & Paunzen (2010).


In Section 4.1, we provide a summarized description of the 18 young associations for which models are built using the individual 6D kinematics of all members. The nine associations with incomplete kinematics are described in Section 4.2 and their adopted average distances and radial velocities are listed in Table 2. A description of the Argus association, included in BANYAN II but excluded here, is provided in Section 4.3. A few individual objects that require further attention are discussed in Section 4.4. The addition of Gaia-DR1 data to several previously recognized high-probability candidate members of the young associations studied here makes them new bona fide members: a description of these new members is provided in Section 4.5. The method that we used to calculate 6D kinematics and their error bars from kinematic observables is described in Section 4.6.

All members compiled in this section were cross-matched with the 2MASS, AllWISE (Wright et al. 2010; Kirkpatrick et al. 2014) and the Gaia-DR1 catalogs. When available, sky positions, proper motions and parallaxes from the Gaia-DR1 catalog were preferred to literature measurements. Targets with no radial velocity measurements reported in the literature membership lists were cross-matched with various catalogs that provide radial velocities (Upgren & Harlow 1996; Hawley et al. 1997; Bobylev et al. 2006; Torres et al. 2006; Kharchenko et al. 2007; White et al. 2007; Fernández et al. 2008; Shkolnik et al. 2011; Chubak et al. 2012; Kordopatis et al. 2013; Malo et al. 2014). When authors did not report radial velocity measurement errors, we adopted those calculated by Riedel et al. (2017b; see their Table 6).

4.1. Associations With Full Kinematics

In this section, we provide a short description of the 18 young associations included in BANYAN Σ for which the models will be built only from their classical bona fide members, in the sense that only members with signs of youth and full 6D kinematics are included.

The list of bona fide members presented in Gagné et al. (2014), which was largely based on that of Malo et al. (2013), was used as a starting point in this work. The Gagné et al. (2014) list includes members of TWA, βPMG, THA, CAR, COL, ARG and ABDMG. Gagné et al. (2017b) compiled an updated list of TWA members with new available data and rejected contaminants from more distant associations; their list was therefore adopted here. We refer the reader to Zuckerman & Song (2004) and Torres et al. (2008) for a detailed description of these associations.

Carina-Near (CARN) has been identified as a co-moving group of ∼200 Myr old stars by Zuckerman et al. (2006), which includes a core of eight members and ten members that are part of a spatially larger stream. Few studies have focused on this moving group since its discovery, likely because it is among the older ones. Gagné et al. (2017a) used a preliminary version of BANYAN Σ to show that the variable T2.5 brown dwarf SIMP J013656.5+093347 is likely a ∼13 M_Jup member of the CARN stream, and Riedel et al. (2017b) included CARN in a young association classification tool (LACEwING) for the first time.

The Ursa Major cluster (UMA; e.g., Eggen 1992) is a well-studied population of co-moving stars, consisting of a core of coeval young stars, and a stream of stars with heterogeneous compositions and ages. Soderblom & Mayor (1993) estimated an age of ∼300 Myr for the core population, and Jones et al. (2015a, 2017) estimated an age of 414 ± 23 Myr based on interferometric measurements of its A-type members. The core membership list of King et al. (2003) was adopted for this work, and the stream was not included in BANYAN Σ because of its heterogeneous nature.

The Hyades cluster (HYA) is a nearby (40–50 pc) and relatively young (600–800 Myr; Perryman et al. 1998) cluster that has been extensively studied in the literature (e.g., Perryman et al. 1998; Zuckerman & Song 2004). The membership list of Perryman et al. (1998) was adopted here.

The Upper Scorpius (USCO), Upper Centaurus-Lupus (UCL) and Lower Centaurus-Crux (LCC) groups are part of the Sco-Cen star-forming region (Blaauw 1946; de Zeeuw et al. 1999), which consists of 5–30 Myr stars located at distances of ∼110–150 pc. The membership lists of Rizzuto et al. (2011), Pecaut & Mamajek (2016) and Donaldson et al. (2017) were adopted here. Rizzuto et al. (2011) only list membership probabilities for the Sco-Cen region, and do not classify their members in the three subgroups. Their list was therefore cross-matched with that of de Zeeuw et al. (1999) to assign the correct subgroup, and none of the Sco-Cen members that were newly discovered by Rizzuto et al. (2011) were included at this stage. Once completed, the BANYAN Σ tool can be used to assign these new members to the correct subgroup; this is done in Section 9. Several radial velocities for USCO, UCL and LCC cataloged by Kharchenko et al. (2007), which seem to be mistakenly listed as originating from Gontcharov (2006), are astrometric¹⁵ radial velocities assuming moving group membership and an average UVW velocity. These measurements were rejected from our compilation.

The Octans association (OCT; Torres et al. 2008) is a group of young stars at ≈120 pc from the Sun, which has not been characterized as well as other young associations mainly due to its sky position located far in the Southern hemisphere (decl. between −87 and −20°). Murphy & Lawson (2015) performed a survey of its low-mass members and determined a lithium age of 20–40 Myr for this group. The members of OCT were compiled from Murphy & Lawson (2015).

The Pleiades cluster (PLE; Cummings 1921; Stauffer et al. 1989) is one of the best-studied clusters in the solar neighborhood. It is located at a distance of ∼130 pc and recent estimates of its age based on its lithium depletion boundary are in the range ∼110–120 Myr (Dahm 2015). The membership lists of Stauffer et al. (2007) and Galli et al. (2017) were adopted here. Sarro et al. (2014) presented a Bayesian method to identify members of the Pleiades based on multivariate Gaussian mixture models. This method differs from BANYAN Σ in that it does not consider other young associations, includes various photometric colors, works directly in proper motion space, and does not consider sky position because they study stars in the direction of the cluster only. The larger number of free parameters introduced by a mixture of multivariate Gaussians makes it possible to model the proper motion and color–magnitude distribution of the Pleiades members, which are not well represented by a single Gaussian distribution. The large number of known Pleiades members allows such a highly parametrized model, but it would likely be challenging to apply this method to sparser or nearby young associations.

Coma Berenices (CBER; also called Melotte 111 and Collinder 256; e.g., Casewell et al. 2006) is a massive and well-studied open cluster located at ∼85 pc. Silaj & Landstreet (2014) estimate an age of ${560}_{-80}^{+100}\,\mathrm{Myr}$ based on the Hertzsprung–Russell diagram position of its Ap-type stars. The membership lists of Casewell et al. (2006), Kraus & Hillenbrand (2007) and Casewell et al. (2014) were used here.

IC 2602 (Melotte 102; Whiteoak 1961) is one of the nearest open clusters, and is located near the Sco-Cen OB region. Its members are located at a distance of ≈150 pc (van Leeuwen 2009) and the cluster has a lithium depletion boundary age of ${46}_{-5}^{+6}\,\mathrm{Myr}$ (Dobbie et al. 2010). The lists of members published by Silaj & Landstreet (2014) and Mermilliod et al. (2009) were adopted as a starting point for BANYAN Σ.

IC 2391 (Omicron Velorum; Platais et al. 2007) is a ∼50 ± 5 Myr old cluster (Barrado y Navascués et al. 2004) located at ≈150 pc, and is also in the vicinity of the Sco-Cen OB region. The membership list of Gaia Collaboration et al. (2017) was used here.

4.2. Associations With Partial Kinematics

This section describes the nine young associations that do not have enough known members with full 6D kinematics to build their kinematic models based on only such members. Instead, an average radial velocity or distance (or both) is adopted for the young association itself. These average distances are obtained by calculating the weighted average of all members with measured distances, where the weights are set to the inverse square of the measurement errors. The error bars on the average distance correspond to an estimate of the intrinsic distance dispersion of the members, rather than a proper measurement error of the average, and were obtained using an unbiased weighted standard deviation¹⁶ of the individual members' distance measurements, with the same weights as described above. The average radial velocities are calculated with the same method, but spectral binaries were avoided in their calculation. All average distances and/or radial velocities that were measured in this section and used in the construction of the kinematic models are listed in Table 2.
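The averaging procedure described above can be sketched as follows. This Python example is only an illustration: the function name is hypothetical, and the unbiased weighted variance is assumed here to take the common reliability-weights form, $s^{2}=\sum_{i}w_{i}(x_{i}-\bar{x})^{2}/(V_{1}-V_{2}/V_{1})$ with $V_{1}=\sum_{i}w_{i}$ and $V_{2}=\sum_{i}w_{i}^{2}$, which may differ in detail from the estimator actually used.

```python
import numpy as np

def weighted_mean_and_dispersion(values, errors):
    """Inverse-variance weighted mean and unbiased weighted standard
    deviation (reliability-weights form, an assumption of this sketch)."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(errors, dtype=float) ** -2   # weights = 1 / error^2
    v1 = w.sum()
    v2 = (w**2).sum()
    mean = (w * x).sum() / v1
    variance = (w * (x - mean) ** 2).sum() / (v1 - v2 / v1)
    return mean, np.sqrt(variance)
```

For equal measurement errors, this estimator reduces to the usual sample mean and the sample standard deviation with an N − 1 denominator.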

Table 2. Adopted Average Distances and Radial Velocities for Young Associations with Partial Kinematics

Association	ν_avg	σ_ν	N_ν	ϖ_avg	σ_ϖ	N_ϖ
	(km s⁻¹)	(km s⁻¹)		(pc)	(pc)
EPSC	⋯	⋯	⋯	102.3	5.7	8
ETAC	20.0	3.1	14	94.4	2.0	14
THOR	⋯	⋯	⋯	96.2	3.5	4
XFOR	18.8	1.4	5	⋯	⋯	⋯
PL8	21.9	2.0	4	⋯	⋯	⋯
ROPH	−6.3	0.3	^a	131.0	3.0	^a
CRA	−0.4	1.2	11	139.4	6.1	3
UCRA	−2.5	2.4	9	148.0	3.0	4
TAU	17.0	2.9	119	126	16	30
118TAU	14.7	1.1	8	112.4	5.6	6

Note.

^aAverage observables taken from Mamajek (2008).


Table 3. Objects Listed as Bona Fide Members that Were Excluded from the Construction of Kinematic Models

Main	Spectral	R.A.	Decl.	${\mu }_{\alpha }\cos \delta$	μ_δ	Rad. Vel.	Distance	Reason for	References
Designation	Type	(hh:mm:ss)	(dd:mm:ss)	(mas yr⁻¹)	(mas yr⁻¹)	(km s⁻¹)	(pc)	Exclusion^a
AB Doradus
PX Vir	G5 V	13:03:49.46	−05:09:45.9	−191.1 ± 0.9	−218.7 ± 0.7	−9.2 ± 0.2	21.7 ± 0.4	VIS	1,2,3,2
2MASS J15534211–2049282	M3.4	15:53:42.09	−20:49:28.6	−10 ± 10	−20 ± 10	−7 ± 2	330 ± 80	σ_kin	4,5,4,4
HR 7214	A4 V	19:03:32.25	01:49:07.6	22.6 ± 0.2	−68.5 ± 0.3	−23 ± 2	54.9 ± 0.9	MST₀	6,2,6,2
LQ Peg	K8 V	21:31:01.86	23:20:05.2	134.6 ± 0.1	−144.9 ± 0.1	−17 ± 1	24.2 ± 0.1	VIS	1,7,8,7
HIP 107948	M3 Ve	21:52:10.51	05:37:33.7	106 ± 2	−147 ± 1	−15 ± 2	31 ± 5	σ_kin	9,9,9,2
${\boldsymbol{\beta }}$ Pictoris
LP 353–51	M3 Ve	02:23:26.75	22:44:05.1	98.5 ± 0.2	−112.5 ± 0.1	13.0 ± 0.4	27.1 ± 0.3	VIS	1,7,10,7
HD 15115	F4 IV	02:26:16.33	06:17:32.4	87.97 ± 0.04	−50.35 ± 0.03	0.8 ± 0.1	48 ± 1	VIS	11,7,12,7
EPIC 211046195	M8.5	03:35:02.09	23:42:35.6	50 ± 10	−60 ± 10	16 ± 2	42 ± 2	VIS	4,5,4,4
HIP 23418 ABCD	M3 V	05:01:58.83	09:58:57.2	10 ± 10	−74 ± 6	18 ± 3	25 ± 1	VIS	13,2,14,15

Note.

^aReason for exclusion from the kinematic models. σ_kin: Excluded from the kinematic precision constraints described in Section 5; Host: The host star was excluded; MST_i: Excluded from iteration i of the Minimum Spanning Tree outlier rejection algorithm described in Section 5; Nσ: Excluded from the Mahalanobis distance criterion described in Section 5; VIS: Visual rejection, see Table 8.

References. (1) Malo et al. (2013), (2) van Leeuwen (2007), (3) Maldonado et al. (2010), (4) Shkolnik et al. (2012), (5) Monet et al. (2003), (6) Zuckerman et al. (2011), (7) Gaia Collaboration et al. (2016a), (8) Montes et al. (2001), (9) Zickgraf et al. (2005), (10) Shkolnik et al. (2017), (11) Lépine & Simon (2009), (12) Desidera et al. (2015), (13) Egret et al. (1992), (14) Song et al. (2003), (15) Riedel et al. (2014), (16) Malo et al. (2014), (17) Zacharias et al. (2010), (18) van Altena et al. (1995), (19) Houk & Swift (1999), (20) Perryman et al. (1998), (21) Lépine et al. (2013), (22) Wilson (1953), (23) White et al. (2007), (24) Griffin et al. (1988), (25) Cayrel de Strobel et al. (2001), (26) Stephenson (1986b), (27) Gray et al. (2003), (28) Nesterov et al. (1995), (29) Stephenson & Sanwal (1969), (30) Stephenson (1986a), (31) Evans (1967), (32) Royer et al. (2007), (33) Torres et al. (2006), (34) Mason et al. (2001), (35) Neuhäuser et al. (2003), (36) Zuckerman & Song (2004), (37) Gontcharov (2006), (38) Anderson & Francis (2012), (39) Bobylev et al. (2006), (40) Kraus et al. (2014), (41) Zuckerman & Webb (2000), (42) Grenier et al. (1999), (43) Pecaut & Mamajek (2013), (44) Elliott et al. (2014), (45) Webb et al. (1999), (46) Ducourant et al. (2014), (47) Pourbaix et al. (2004), (48) Kraus & Hillenbrand (2007), (49) Luhman & Steeghs (2004), (50) Shvonski et al. (2016), (51) M. J. Pecaut (2018, private communication), (52) Houk & Cowley (1975), (53) Kharchenko et al. (2007), (54) Pecaut & Mamajek (2016), (55) Houk (1978), (56) Torres et al. (2008), (57) Torres et al. (2003), (58) Zacharias et al. (2013), (59) Walter et al. (1988), (60) Kraus et al. (2017), (61) Wichmann et al. (2000), (62) Hartigan et al. (1994), (63) Welty et al. (1994), (64) Hartigan & Kenyon (2003), (65) Zacharias et al. (2012), (66) Martín et al. (2005), (67) Herbig (1977), (68) Roeser et al. (2010), (69) Hartmann et al. (1986), (70) Slesnick et al. (2006b), (71) Wichmann et al. (1997), (72) Houk (1982), (73) Erickson et al. (2011), (74) Houk & Smith-Moore (1988), (75) Cieza et al. (2007), (76) Prato (2007), (77) Guenther et al. (2007), (78) Lawrence et al. (2007), (79) Bouy & Martín (2009), (80) Donaldson et al. (2017), (81) Dahm et al. (2012), (82) Preibisch et al. (2002), (83) Mermilliod et al. (2009), (84) Kordopatis et al. (2013), (85) Galli et al. (2017), (86) Majewski et al. (2016), (87) Garcia et al. (1988).

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

Table 4. New Bona Fide Members with Full Kinematics Compiled in this Work

Designation	References^a
AB Doradus
HS Psc	Malo et al. (2014)
HD 201919	Malo et al. (2014)
${\boldsymbol{\beta }}$ Pictoris
AF Psc	Shkolnik et al. (2017)
2MASS J16572029-5343316	Malo et al. (2014)
CD–31 16041	Malo et al. (2014)
2MASS J19560438-3207376	Malo et al. (2014)
2MASS J22424896-7142211	Malo et al. (2014)
BD–13 6424	Malo et al. (2014)
Columba
GJ 1284	Malo et al. (2014)
Tucana-Horologium
2MASS J02303239–4342232	Kraus et al. (2014)
2MASS J04000382–2902165	Kraus et al. (2014)
2MASS J04000395–2902280	Kraus et al. (2014)
2MASS J04021648–1521297	Kraus et al. (2014)
CD–34 521	Malo et al. (2014)
CD–53 544	Malo et al. (2014)
CD–58 553	Malo et al. (2014)
CD–35 1167	Malo et al. (2014)
CD–44 1173	Malo et al. (2014)
2MASS J04480066-5041255	Malo et al. (2014)
2MASS J05332558-5117131	Malo et al. (2014)
2MASS J23261069-7323498	Malo et al. (2014)
${{\boldsymbol{\chi }}}^{1}$ For
CD–37 1263^b	Dias et al. (2014)
HD 17864	B. S. Alessi (2018, private communication)

Notes. This table lists objects that were designated as candidate members and that we confirm as bona fide members with full kinematics from compiling their missing measurements. See Section 4 for more details.

^aReference that designated this object as a candidate member of the young association. ^bThe age of CD–37 1263 has not been investigated here; verifying its youth is still necessary to confirm that it is a true bona fide member.

Instead of assigning the exact same average distance or radial velocity to each member with a missing observable, artificial values were drawn from a random distribution (limited to ±1σ) with a characteristic width set to the measured intrinsic dispersion of the members (listed in Table 2). This avoids artificially placing several members along lines in XYZ and UVW space, or along planes in UVW space when both radial velocity and distance are missing. The lack of full 6D kinematics for a significant number of candidates will result in a lower recovery rate of their true members by BANYAN Σ, and in a larger number of contaminants from field stars. Quantifying this effect in terms of exact true-positive and false-positive rates is not currently possible, given our lack of information on the true shapes and sizes of the spatial and kinematic distributions of these associations; however, Gaia-DR2 will allow us to greatly refine their kinematic models.
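The drawing of artificial values described above can be sketched as follows. This is only an illustration under stated assumptions: the distribution family is not specified in the text, so a Gaussian truncated at ±1σ is assumed here, and the function names and numerical values are hypothetical.

```python
import random


def draw_clipped_gaussian(mean, sigma, rng, n_sigma=1.0):
    """Draw one value from N(mean, sigma), rejecting draws outside mean +/- n_sigma*sigma.

    Assumption: a truncated Gaussian stands in for the unspecified
    "random distribution (limited to +/-1 sigma)".
    """
    while True:
        value = rng.gauss(mean, sigma)
        if abs(value - mean) <= n_sigma * sigma:
            return value


def fill_missing(values, mean, sigma, rng):
    """Replace None entries with clipped random draws; keep measured values as-is."""
    return [draw_clipped_gaussian(mean, sigma, rng) if v is None else v
            for v in values]


# Hypothetical example: three members lack a distance measurement.
rng = random.Random(42)
distances_pc = [101.2, None, 98.7, None, None]
filled = fill_missing(distances_pc, mean=100.0, sigma=4.0, rng=rng)
```

Because each missing value receives its own draw, members without a measurement no longer share one identical distance or radial velocity.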

${\boldsymbol{\epsilon }}$ Chamaeleontis (EPSC; Mamajek et al. 2000; Feigelson et al. 2003; Murphy et al. 2013) is a young (3–5 Myr) and relatively distant (100–120 pc) association that is part of the Chamaeleon molecular cloud complex (Luhman et al. 2008). Murphy et al. (2013) refined the age of EPSC to ${3.7}_{-1.4}^{+4.6}\,\mathrm{Myr}$ by comparing its members with the Dartmouth isochrones of Dotter et al. (2008). The membership list of Murphy et al. (2013) was adopted here.

The ${\boldsymbol{\eta }}$ Chamaeleontis cluster (ETAC; Mamajek et al. 1999) is a group of young (11 ± 3 Myr; Bell et al. 2015) stars located at a distance of ∼100 pc, and in the vicinity of the Sco-Cen OB association. The membership lists of Mamajek et al. (2000) and Lyo et al. (2004) were adopted in this work.

The 32 Orionis group (THOR; Mamajek 2007; Shvonski et al. 2010; Bell et al. 2017) is a young group of ∼25 Myr old stars located at ∼96 pc. The age of THOR was revised to ${22}_{-3}^{+4}\,\mathrm{Myr}$ by Bell et al. (2015) from a comparison of its members to model isochrones. Burgasser et al. (2016) recently identified the first substellar candidate member of THOR, with an estimated mass of 14 M_Jup near the planetary-mass boundary. The membership list of Bell et al. (2017) was adopted here.

The ${{\boldsymbol{\chi }}}^{1}$ For association (XFOR; also called Alessi 13) was identified by Dias et al. (2002), and Kharchenko et al. (2013) estimated an age of ∼525 Myr based on the main-sequence turnoff. However, Mamajek (2015) argues that it could be as young as ∼30 Myr given the saturated X-ray emission of its members. Further studies will be required to address this discrepancy. Only one member of XFOR has been published with full kinematics (the triple star χ¹ For). In order to identify more members, the Dias et al. (2014) list of 4102 XFOR candidates was cross-matched with Gaia-DR1. Of the 261 matches with a parallax measurement, only nine are located within 10 pc of the χ¹ For system in XYZ space. This indicates that the Dias et al. (2014) sample is likely highly contaminated by background stars, and we therefore recommend caution in its use. A literature search identified one additional radial velocity measurement, for CD–37 1263, thus completing its kinematic measurements and making it a new likely bona fide member of XFOR (although its age is not investigated here). Six of the eight additional potential members are located within 5 km s⁻¹ of the star χ¹ For in UVW space if we assume the same radial velocity measurement, and they are therefore included as high-likelihood candidate members. Two additional XFOR members were identified by B. S. Alessi et al. (2018, private communication): HD 21434 and HD 17864. Both have a parallax measurement in Gaia-DR1, but only HD 17864 also has a radial velocity measurement available in the literature.

Platais 8 (or a Car; PL8) is a ∼60 Myr old cluster of stars at a distance of ∼130 pc identified by Platais et al. (1998). It has since received very little attention in the literature, and only four of its members benefit from full kinematic measurements (a Car, HD 76230, H Vel, OY Vel).

${\boldsymbol{\rho }}$ Ophiuchi (ROPH) is the nearest star-forming cloud complex to the Sun. It has been the subject of extensive studies in recent decades (e.g., see Reipurth et al. 1991; Reipurth 2008; Wilking et al. 2008). Its age is estimated at <2 Myr, and it includes embedded clusters with stars believed to be as young as ∼0.1 Myr (Luhman & Rieke 1999). Because this group is too distant for its members to have been directly detected by the Hipparcos mission, Mamajek (2008) used the measured parallaxes of Hipparcos stars illuminating the Lynds 1688 dark cloud, which is part of ROPH, to estimate its distance at 131 ± 3 pc. Mamajek (2008) also estimated an average radial velocity of −6.3 ± 0.3 km s⁻¹ from individual radial velocity measurements of its members. The membership lists of Wilking et al. (2008) and Ducourant et al. (2017) were adopted here, and the average radial velocity and distance of Mamajek (2008) were adopted for all members with missing measurements. Only the 194 out of 340 members that have a proper motion measurement were used in the construction of the BANYAN Σ kinematic model of ROPH. A cross-match of these with Gaia-DR1 yielded 84 matches, indicating that the second data release will provide a wealth of new information on the distances and proper motions of ROPH members.

The Corona Australis (CRA) star-forming region is located at a distance of ∼150 pc (Neuhäuser & Forbrich 2008; Reipurth 2008), and includes the well-studied R CrA dark cloud (e.g., see Wilking et al. 1992). Gennaro et al. (2012) estimated the age of the eclipsing binary system TY CrA between ${3.8}_{-0.2}^{+2.7}\,\mathrm{Myr}$ and ${5.2}_{-0.7}^{+3.1}\,\mathrm{Myr}$ , based on a comparison of the dynamical masses of its components with a set of warm and cold PISA pre-main-sequence models (Tognelli et al. 2011), respectively. Here we therefore adopt an age of ∼4–5 Myr for CRA. The membership list of Neuhäuser & Forbrich (2008) was adopted here.

Several stars in the vicinity of CRA discovered by Neuhäuser et al. (2000) were found to be located between CRA and the Sco-Cen region, and at similar distances to both of these regions. This population likely consists of stars that formed along a filament between CRA and Sco-Cen ∼10 Myr ago. We included them in the models of BANYAN Σ, and tentatively name this population Upper CrA (UCRA hereafter).

The Taurus-Auriga (TAU) star-forming region is a complex of several dark clouds located at ∼130 pc, and composed of stars with ages ∼1–2 Myr that share similar kinematics (e.g., see Kenyon & Hartmann 1995; Reipurth 2008). The membership lists of Luhman et al. (2009) and Esplin et al. (2014) were adopted here, without differentiating the sub-groups. Measurement errors were not provided for the TAU radial velocities measured by Wichmann et al. (2000), but they report two sets of measurements from two distinct instruments. We measured the standard deviations of the radial velocity differences for the 25 stars in their sample that were observed with both instruments, ignoring 10 spectral binaries and two stars with significantly different measurements (>8 km s⁻¹). We adopted this standard deviation of 2 km s⁻¹ as their radial velocity measurement errors.

Mamajek (2016) identified 11 stars in the vicinity of TAU that share a larger proper motion and a closer distance to the Sun than the rest of the group (see the discussions of Currie et al. 2017 and Kraus et al. 2017). This group, named after its brightest member 118 Tau (118TAU hereafter), seems to display a slightly older age than TAU, at ∼10 Myr. The membership list of Mamajek (2016) was adopted here.

4.3. Rejected Associations

The Argus association was removed entirely from the models of BANYAN Σ, as Bell et al. (2015) demonstrated that it is either largely contaminated, or composed of objects that do not form a coeval association (see also Mamajek 2015). In addition, the Octans-Near (Zuckerman et al. 2013) and Hercules-Lyra associations (Gaidos 1998; Fuhrmann 2004; López-Santiago et al. 2006; Eisenbeiss et al. 2013) were not included in BANYAN Σ, as they were also demonstrated to be likely composed of non-coeval stars (Brandt et al. 2014; Mamajek 2015; Riedel et al. 2017b).

4.4. Discussion on Individual Objects

Individual stars that require more detailed considerations are discussed in this section. In addition to this, we note that several stars listed by different authors as bona fide members of different associations (e.g., V570 Car, CP–68 1388 and CD–69 1055) are excluded from the BANYAN Σ kinematic models and are listed in Table 3.

Table 5. Literature Compilation of Bona Fide Members

Main	Spectral	R.A.	Decl.	${\mu }_{\alpha }\cos \delta$	μ_δ	Rad. Vel.	Distance	References
Designation	Type	(hh:mm:ss)	(dd:mm:ss)	(mas yr⁻¹)	(mas yr⁻¹)	(km s⁻¹)	(pc)
AB Doradus
2MASS J00192626+4614078	M8 β	00:19:26.26	46:14:07.8	119.4 ± 0.9	−75.4 ± 0.9	−20 ± 3	39 ± 2	1, 2, 3, 2
BD+54 144 A	F8 V	00:45:51.06	54:58:39.1	96.40 ± 0.03	−73.97 ± 0.04	−15 ± 2	50.3 ± 0.9	4, 5, 6, 5
— BD+54 144 B	K3	00:45:51.23	54:58:40.8	⋯	⋯	⋯	⋯	7, ⋯, ⋯, ⋯
2MASS J00470038+6803543	L6–L8 γ	00:47:00.39	68:03:54.4	385 ± 1	−201 ± 1	−20 ± 1	12.2 ± 0.3	⋯, 2, 8, 2
G 132–51 B	M2.6	01:03:42.23	40:51:13.6	132 ± 5	−164 ± 5	−10.6 ± 0.3	30 ± 2	9, 9, 9, 9
— G 132–50	M0	01:03:40.30	40:51:26.7	126.9 ± 0.1	−166.18 ± 0.09	⋯	33.0 ± 0.6	10, 5, ⋯, 5
— G 132–51 C	M3.8	01:03:42.44	40:51:13.3	⋯	⋯	−10.9 ± 0.4	⋯	9,⋯, 9, ⋯
HIP 6276	G0 V	01:20:32.38	−11:28:05.8	111.44 ± 0.06	−136.95 ± 0.05	8.3 ± 0.4	35.1 ± 0.4	11, 5, 12, 5
G 269–153 A	M4.3	01:24:27.85	−33:55:11.4	180 ± 20	−110 ± 20	19 ± 3	25 ± 1	9, 13, 9, 9
— G 269–153 B	M4.6	01:24:27.96	−33:55:09.9	⋯	⋯	18 ± 1	⋯	9, ⋯, 9, ⋯

Note. The references to this table are listed in Table 6.

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

AB Pic was incorrectly listed by Gagné et al. (2014) as a bona fide member of both THA and CAR. This was a consequence of Zuckerman & Song (2004) listing it as a bona fide member of THA and Torres et al. (2008) revising it to a bona fide member of CAR. Since its UVW position is at 0.7 ± 0.7 km s⁻¹ from the core of CAR and at 5.5 ± 2.1 km s⁻¹ from that of THA, here it was included in the list of CAR members (see also the discussions of Bell et al. 2015, Section B2.3, and Malo et al. 2013, Section 9.1.5).

DK Leo was listed by Malo et al. (2013) as an ambiguous candidate member between βPMG, COL and ABDMG, because of contradictory measurements for its radial velocity (Montes et al. 2001; Kharchenko et al. 2007; López-Santiago et al. 2010), and Gagné et al. (2014) incorrectly listed it as a bona fide member of both βPMG and CAR. Here it is excluded from the list of bona fide members until more radial velocity measurements become available.

HD 23524 was defined as a bona fide member of THA by Zuckerman et al. (2011), and Malo et al. (2013) defined it as a bona fide member of COL because it lies nearer in XYZ space, although they note that its membership is ambiguous. Here HD 23524 is excluded from the BANYAN Σ kinematic models.

HIP 3556 was noted as a radial velocity variable and a spectroscopic double-lined binary by Kastner et al. (2017). Until a full radial velocity curve is available, this object is excluded from the BANYAN Σ kinematic models.

2MASS J06085283–2753583 was identified by Rice et al. (2010) as a candidate member of βPMG and it now has full kinematic measurements, but Faherty et al. (2016) showed that it is an ambiguous member, which is probably due to its small proper motion. This object was therefore not included in the kinematic models of BANYAN Σ.

4.5. New Bona Fide Members

New bona fide members that could be defined as such based on Gaia-DR1 data are described in this section, and are listed in Table 4.

Table 6. References for Literature Compilation of Bona Fide Members

References

(1) Gagné et al. (2015b), (2) Liu et al. (2016), (3) Reiners & Basri (2009), (4) Jaschek et al. (1964), (5) Gaia Collaboration et al. (2016a), (6) Bobylev et al. (2006), (7) Zuckerman & Song (2004), (8) Faherty et al. (2016), (9) Shkolnik et al. (2012), (10) Lépine et al. (2013), (11) Malo et al. (2013), (12) White et al. (2007), (13) Monet et al. (2003), (14) Malo et al. (2014), (15) Schlieder et al. (2010), (16) van Leeuwen (2007), (17) Ivanov (2008), (18) Tokovinin & Smekhov (2002), (19) Garrison & Gray (1994), (20) Gontcharov (2006), (21) Bystrov et al. (1994), (22) Egret et al. (1992), (23) Anderson & Francis (2012), (24) Houk & Cowley (1975), (25) Zuckerman et al. (2011), (26) Blake et al. (2010), (27) Chubak & Marcy (2011), (28) Kordopatis et al. (2013), (29) Høg et al. (2000), (30) Torres et al. (2006), (31) Zacharias et al. (2004a), (32) Gray et al. (2006), (33) Martín & Brandner (1995), (34) Kharchenko et al. (2007), (35) Messina et al. (2010), (36) Montes et al. (2001), (37) Reid et al. (2004), (38) Kunder et al. (2017), (39) Terrien et al. (2015), (40) Knapp et al. (2004), (41) Dupuy & Liu (2012), (42) Gagné et al. (2015a), (43) Dieterich et al. (2014), (44) Röser et al. (2008), (45) Abt & Morrell (1995), (46) Zuckerman et al. (2001a), (47) Zacharias et al. (2010), (48) Riedel et al. (2014), (49) Valenti & Fischer (2005), (50) Holmberg et al. (2007), (51) Song et al. (2003), (52) Macintosh et al. (2015), (53) Zacharias et al. (2004b), (54) Allers & Liu (2013), (55) Lépine & Simon (2009), (56) Bobylev & Bajkova (2007), (57) Gizis et al. (2002), (58) Bonnefoy et al. (2013), (59) Torres et al. (2009), (60) Kiss et al. (2011), (61) Corbally (1984), (62) Liu et al. (2013), (63) Allers et al. (2016), (64) Shkolnik et al. (2017), (65) Gray et al. (2003), (66) Eggl et al. (2013), (67) King et al. (2003), (68) Levato & Abt (1978), (69) Evans (1967), (70) Fabricius et al. (2002), (71) Gray & Garrison (1987), (72) Mamajek et al. (2010), (73) Zuckerman et al. 
(2006), (74) Salim & Gould (2003), (75) Nordström et al. (2004), (76) Desidera et al. (2015), (77) Koen et al. (2010), (78) Griffin et al. (1988), (79) Joy & Wilson (1949), (80) Wilson (1953), (81) Perryman et al. (1998), (82) Stephenson (1986a), (83) Hussain et al. (2006), (84) Nesterov et al. (1995), (85) Cayrel de Strobel et al. (2001), (86) Stephenson (1986b), (87) Cenarro et al. (2009), (88) Gebran et al. (2010), (89) Stocke et al. (1991), (90) Gray et al. (2001), (91) Karataş et al. (2004), (92) Cenarro et al. (2007), (93) Pourbaix et al. (2004), (94) Morgan & Hiltner (1965), (95) Keenan & McNeil (1989), (96) Bilíková et al. (2010), (97) Morgan & Keenan (1973), (98) Paunzen et al. (2001), (99) Benedict et al. (2014), (100) Mermilliod et al. (2009), (101) Cowley & Fraquelli (1974), (102) Morse et al. (1991), (103) de Bruijne & Eilers (2012), (104) Christy & Walker (1969), (105) van Belle & von Braun (2009), (106) Tomkin et al. (1995), (107) Adams et al. (1935), (108) Gray & Garrison (1989a), (109) Kraft (1965), (110) Patel et al. (2013), (111) Abt & Levy (1985), (112) Wilson (1962), (113) Maderak et al. (2013), (114) Royer et al. (2007), (115) Nassau & Macrae (1955), (116) Cowley et al. (1969), (117) Zuckerman & Webb (2000), (118) Cruz et al. (2007), (119) Levato (1975), (120) Gagné et al. (2015c), (121) Głȩbocki & Gnaciński (2005), (122) Kraus et al. (2014), (123) Moór et al. (2006), (124) Houk (1982), (125) Ducourant et al. (2014), (126) Elliott et al. (2014), (127) Gagné et al. (2017b), (128) Teixeira et al. (2008), (129) Pecaut & Mamajek (2013), (130) Weinberger et al. (2013), (131) Webb et al. (1999), (132) Torres et al. (2003), (133) Shkolnik et al. (2011), (134) Looper et al. (2010b), (135) Donaldson et al. (2016), (136) Looper et al. (2010a), (137) Schneider et al. (2012), (138) Mohanty et al. (2003), (139) Zacharias et al. (2013), (140) Gizis et al. (2007), (141) Rodriguez et al. 
(2011), (142) Mamajek (2005), (143) Kraus & Hillenbrand (2007), (144) Yoss & Griffin (1997), (145) Massarotti et al. (2008), (146) Lyo et al. (2004), (147) Girard et al. (2011), (148) Luhman & Steeghs (2004), (149) Lopez Martí et al. (2013), (150) Bell et al. (2017), (151) Zacharias et al. (2015), (152) Shvonski et al. (2016), (153) Roeser et al. (2010), (154) Alcalá et al. (2000), (155) Mace et al. (2009), (156) Edwards (1976), (157) Houk (1978), (158) Jackson & Stoy (1955), (159) Mamajek (2015), (160) Murphy et al. (2013), (161) Torres et al. (2008), (162) Terranegra et al. (1999), (163) Kastner et al. (2012), (164) Guenther et al. (2007), (165) Mamajek et al. (2000), (166) Grenier et al. (1999), (167) Hales et al. (2014), (168) Grady et al. (2004), (169) Luhman (2004), (170) Riaz et al. (2006), (171) E. Bubar et al. (2018, in preparation), (172) Covino et al. (1997), (173) Li & Hu (1998), (174) Kraus et al. (2017), (175) Abt (2008), (176) Abt (2004), (177) Slesnick et al. (2006b), (178) Hiltner et al. (1969), (179) Pecaut & Mamajek (2016), (180) Song et al. (2012), (181) Chen et al. (2011), (182) Levenhagen & Leister (2006), (183) Lutz & Lutz (1977), (184) Moór et al. (2011), (185) Walter et al. (1988), (186) Riviere-Marichalar et al. (2012), (187) Wichmann et al. (2000), (188) Herbig (1977), (189) White & Basri (2003), (190) Hartmann et al. (1987), (191) Sartoretti et al. (1998), (192) Alves de Oliveira et al. (2012), (193) Xiao et al. (2012), (194) Hartmann et al. (1986), (195) Patterer et al. (1993), (196) Donati et al. (1997), (197) Herbig et al. (1986), (198) Hessman & Guenther (1997), (199) Appenzeller et al. (1988), (200) Mundt et al. (1983), (201) Sestito et al. (2008), (202) Hartigan & Kenyon (2003), (203) Strassmeier (2009), (204) Hartigan et al. (1994), (205) Monin et al. (2010), (206) Herbig (1990), (207) Mathieu et al. (1997), (208) Joy (1949), (209) Zacharias et al. (2012), (210) Mooley et al. (2013), (211) Esplin et al. 
(2014), (212) Cohen & Kuhi (1979), (213) Mora et al. (2001), (214) Duchêne et al. (1999), (215) Gahm et al. (1999), (216) Bragança et al. (2012), (217) Hube (1970), (218) Buscombe (1969), (219) Krautter et al. (1997), (220) Mamajek et al. (2002), (221) Köhler et al. (2000), (222) Erickson et al. (2011), (223) Cieza et al. (2007), (224) Prato (2007), (225) Struve & Rudkjøbing (1949), (226) Bouvier & Appenzeller (1992), (227) Wilking et al. (2005), (228) Lawrence et al. (2007), (229) Ansdell et al. (2016), (230) Manara et al. (2015), (231) Kurosawa et al. (2006), (232) Ricci et al. (2010), (233) Ducourant et al. (2017), (234) Suárez et al. (2006), (235) Martín et al. (1998), (236) Slesnick et al. (2006a), (237) Lodieu (2013), (238) Rydgren (1980), (239) Orellana et al. (2012), (240) Houk & Smith-Moore (1988), (241) Dahm et al. (2012), (242) Preibisch et al. (1998), (243) Donaldson et al. (2017), (244) Siebert et al. (2011), (245) Bouy & Martín (2009), (246) Preibisch & Zinnecker (1999), (247) Walter et al. (1994), (248) Abt (1981), (249) Preibisch et al. (2001), (250) Cucchiaro et al. (1976), (251) Carpenter et al. (2006), (252) Beavers & Cook (1980), (253) Donati et al. (2006), (254) Abt (2009), (255) Galli et al. (2017), (256) Majewski et al. (2016), (257) Cottaar et al. (2015), (258) McCarthy & Treanor (1964), (259) Breger (1984), (260) Abt & Levato (1978), (261) Mendoza V (1956), (262) Gray & Garrison (1989b), (263) Binnendijk (1946), (264) Walter et al. (1997), (265) Zacharias et al. (2017), (266) Melo (2003), (267) Carmona et al. (2007), (268) Forbrich & Preibisch (2007), (269) Vieira et al. (2003), (270) Corporon et al. (1996), (271) Garcia et al. (1988), (272) Messina et al. (2011), (273) De Silva et al. (2013), (274) Pickles & Depagne (2010), (275) Moór et al. (2013), (276) Bourgés et al. (2014), (277) Anderson & Francis (2012), (278) Bobylev et al. (2006), (279) Kharchenko et al. (2007), (280) White et al. (2007), (281) Torres et al. 
(2006), (282) Siebert et al. (2011), (283) Chubak et al. (2012), (284) Mermilliod et al. (2009).

The candidate members of THA identified by Kraus et al. (2014) were cross-matched with Gaia-DR1. Seven were found to have a parallax measurement: three of them did not match any moving group in BANYAN II and were therefore rejected (2MASS J02000918–8025009, 2MASS J02105538–4603588, and 2MASS J05332558–5117131), and the other four were confirmed as new bona fide members of THA (2MASS J02303239–4342232, 2MASS J04000382–2902165, 2MASS J04000395–2902280, 2MASS J04021648–1521297).

A similar cross-match of the Malo et al. (2014) candidate members missing a distance measurement with Gaia-DR1 yielded 21 matches, 16 of which were confirmed as new bona fide members (five in βPMG, eight in THA, two in ABDMG and one in COL). The UVW position of 2MASS J20395460+0620118 (an ABDMG candidate from Malo et al. 2014) is a better match to the Gagné et al. (2014) position of ARG or βPMG than to that of ABDMG. It was therefore categorized as an ambiguous member until it is studied in more detail. Furthermore, Malo et al. (2014) assigned 2MASS J02303239–4342232 to COL, whereas Kraus et al. (2014) listed it as a THA member, and it is therefore categorized as an ambiguous member in this compilation.

Shkolnik et al. (2017) performed a survey of new low-mass members in βPMG, and identified 39 new objects with signs of youth, sky position, proper motion, and radial velocities that match βPMG. As only parallaxes were still needed for them to be included in our kinematic models, their sample was cross-matched with Gaia-DR1. Five objects were found to have a parallax measurement. Four of them (HD 337919, TYC 2136–2484–1, TYC 2658–31–1, TYC 1084–672–1) have Gaia-DR1 trigonometric distances above 200 pc, preventing a credible membership in βPMG (they were also rejected as βPMG candidates by Shkolnik et al. 2017). The last object, TYC 2703–706–1, was designated as a βPMG candidate by Shkolnik et al. (2017), but has a UVW position that is located at 8.9 km s⁻¹ from the central position of βPMG, and only 3.4 km s⁻¹ from that of Columba, as defined by Gagné et al. (2014). We therefore categorize it as an ambiguous member between βPMG and COL until it is investigated further.

Six more objects in the Shkolnik et al. (2017) sample have a parallax measurement from other works in the literature (Shkolnik et al. 2012; Riedel et al. 2014); five of the six were already included in the list of bona fide members presented here. The remaining star, AF Psc, has a parallax measurement by van Altena et al. (1995), which seems to have been overlooked in previous studies. It was thus added to the list of bona fide members of βPMG.

Riedel et al. (2017a) presented several new M-type young moving group candidates; however none of them benefit from a parallax distance measurement either in Gaia-DR1 or elsewhere in the literature, hence they were not included in the list of bona fide members.

4.6. Calculation of the 6D Kinematics

The Galactic positions XYZ and space velocities UVW, in a right-handed system where U points toward the Galactic center, were calculated for all members by assuming Gaussian error bars in sky position, proper motion, radial velocity and parallax. A 10⁴-element Monte Carlo approach was used to propagate error bars in XYZUVW space, by adopting the standard deviation of each coordinate as its measurement error, and therefore assuming that error bars are Gaussian in XYZUVW space. The resulting list of bona fide members is presented in Table 5 with the corresponding references in Table 6, and their various designations are listed in Table 7. A list of new bona fide members confirmed in this work is given in Table 4, and their positions on the sky are displayed in Figure 1.
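A minimal sketch of such a Monte Carlo propagation is given below. It is not the implementation used in this work: the function names are hypothetical, the rotation matrix is the standard ICRS-to-Galactic matrix, a parallax (rather than a distance) is the sampled quantity, and sky-position errors are set to zero for simplicity. The example values for HIP 6276 are taken from Table 5, with the parallax and its error derived here from the tabulated distance as an assumption.

```python
import math
import random
import statistics

# Standard ICRS -> Galactic rotation matrix; rows are the Galactic X, Y, Z axes
# expressed in equatorial coordinates (X, and therefore U, toward the Galactic center).
T = (
    (-0.0548755604, -0.8734370902, -0.4838350155),
    (0.4941094279, -0.4448296300, 0.7469822445),
    (-0.8676661490, -0.1980763734, 0.4559837762),
)
K = 4.74047  # km/s for a proper motion of 1 arcsec/yr at a distance of 1 pc


def rotate(vec):
    """Rotate an equatorial Cartesian vector into Galactic coordinates."""
    return tuple(sum(row[j] * vec[j] for j in range(3)) for row in T)


def xyzuvw(ra_deg, dec_deg, pmra, pmdec, rv, plx):
    """pmra is mu_alpha*cos(delta); proper motions in mas/yr, rv in km/s, plx in mas."""
    a, d = math.radians(ra_deg), math.radians(dec_deg)
    dist = 1000.0 / plx  # pc
    r_hat = (math.cos(d) * math.cos(a), math.cos(d) * math.sin(a), math.sin(d))
    e_hat = (-math.sin(a), math.cos(a), 0.0)  # local east
    n_hat = (-math.sin(d) * math.cos(a), -math.sin(d) * math.sin(a), math.cos(d))  # local north
    v_e = K * 1e-3 * pmra * dist   # tangential velocity components in km/s
    v_n = K * 1e-3 * pmdec * dist
    pos = tuple(dist * r for r in r_hat)
    vel = tuple(rv * r + v_e * e + v_n * n for r, e, n in zip(r_hat, e_hat, n_hat))
    return rotate(pos) + rotate(vel)


def monte_carlo_xyzuvw(obs, err, n_draws=10_000, seed=0):
    """Return the mean and standard deviation of X, Y, Z, U, V, W over Gaussian draws.

    Draws with a parallax near zero are not handled; this sketch assumes
    well-measured parallaxes.
    """
    rng = random.Random(seed)
    draws = [xyzuvw(*(rng.gauss(o, e) for o, e in zip(obs, err)))
             for _ in range(n_draws)]
    columns = list(zip(*draws))
    return ([statistics.fmean(c) for c in columns],
            [statistics.pstdev(c) for c in columns])


# HIP 6276 (Table 5); parallax derived from 35.1 +/- 0.4 pc (assumption).
obs = (20.13492, -11.46828, 111.44, -136.95, 8.3, 1000.0 / 35.1)
err = (0.0, 0.0, 0.06, 0.05, 0.4, 0.4 / 35.1 * 1000.0 / 35.1)
means, sigmas = monte_carlo_xyzuvw(obs, err)
```

The standard deviation of each coordinate over the draws then plays the role of its measurement error, as described above.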

Table 7. List of Bona Fide Members' Designations

Main	2MASS	AllWISE	Gaia	Other
AB Doradus
2MASS J00192626+4614078	J00192626+4614078	J001926.39+461406.8	⋯	⋯
BD+54 144 A	J00455088+5458402	J004551.02+545839.6	417565757132068096	HD 4277 A, HIP 3589 A
— BD+54 144 B	⋯	⋯	⋯	HD 4277 B, HIP 3589 B
2MASS J00470038+6803543	J00470038+6803543	J004701.09+680352.2	529737830321549952	⋯
G 132–51 B	J01034210+4051158	J010342.24+405114.2	374400889126932096	⋯
— G 132–50	J01034013+4051288	J010340.24+405127.4	374400957846408192	⋯
— G 132–51 C	⋯	⋯	374400893422315648	⋯
HIP 6276	J01203226–1128035	J012032.34–112805.2	2470272808484339200	CD–12 243
G 269–153 A	J01242767–3355086	J012427.84–335510.0	5015892473055253120	⋯
— G 269–153 B	⋯	⋯	5015892473055253248	⋯

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

**Figure 1.** Sky distribution of young association members that were used here to build the models of BANYAN Σ. The Galactic plane ( $| b| \lt 15^\circ$ ) is designated with the gray region. The nearest young associations cover much larger fractions of the sky, which makes it harder to recognize their members without measuring their full 6D kinematics. Most of the young association members are located in the Southern hemisphere, with a few notable exceptions (UMA, CBER, PLE, HYA, TAU, and 118TAU). See Section 4 for more details.

5. Kinematic Models of Young Associations

The bona fide members compiled in Table 5 were used to build the kinematic models of young associations considered in BANYAN Σ. Objects with total error bars on their Galactic position ( $\sqrt{{\sigma }_{X}^{2}+{\sigma }_{Y}^{2}+{\sigma }_{Z}^{2}}$ ) above 20% of their distance or on their space velocity ( $\sqrt{{\sigma }_{U}^{2}+{\sigma }_{V}^{2}+{\sigma }_{W}^{2}}$ ) above 8 km s⁻¹ were excluded. Companions in binary systems were ignored, and the spatial-kinematic position of binary systems was approximated as that of the primary star to avoid the need for model-dependent mass estimates.
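The two quality cuts above can be written as a short predicate. This is a sketch only; the function name and its argument layout are hypothetical.

```python
import math


def passes_quality_cuts(distance_pc, sigma_xyz, sigma_uvw,
                        max_frac=0.20, max_vel=8.0):
    """Return True if a star is kept for the model construction.

    sigma_xyz: (sigma_X, sigma_Y, sigma_Z) in pc
    sigma_uvw: (sigma_U, sigma_V, sigma_W) in km/s
    """
    sigma_spa = math.sqrt(sum(s * s for s in sigma_xyz))  # total spatial error
    sigma_kin = math.sqrt(sum(s * s for s in sigma_uvw))  # total kinematic error
    return sigma_spa <= max_frac * distance_pc and sigma_kin <= max_vel
```

For instance, a star at 10 pc with errors of (1, 2, 2) pc has a total spatial error of 3 pc, which exceeds 20% of its distance, so it would be excluded.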

A rejection algorithm based on minimum spanning trees (MSTs; e.g., see Allison et al. 2009; Gagné et al. 2015c) was used to ignore outliers in spatial and kinematic space. Groups that required adopting average radial velocities or distances for some high-likelihood members were exempted from this rejection step because of their small number of bona fide members (118TAU, CRA, EPSC, ETAC, PL8, ROPH, TAU, THOR, UCRA, and XFOR). If these groups are contaminated by outliers that originate from larger spatial or kinematic distributions, this exemption will result in models that are artificially biased to larger sizes, therefore increasing the rate of contamination that these groups may be subject to. The discovery of more members will be necessary to better assess and correct this effect. A spanning tree is built by connecting each star of a young association in XYZ or UVW space with straight lines while avoiding loops; the MST is the spanning tree with the shortest total length. MSTs provide a measurement of scale that does not depend on the shape of a distribution, and therefore does not require making the assumption that the stars are normally distributed, or aligned with the XYZUVW axes.

For each association containing a total of N_k members, the algorithm of Cartwright & Whitworth (2004) was used to build N_k+1 MSTs, in both XYZ and UVW spaces separately. The first spatial and kinematic MSTs with respective total lengths L_spa and L_kin include all members, and the N_k additional MSTs ignore one member at a time, and have respective total lengths ${L}_{i,\mathrm{spa}}$ and ${L}_{i,\mathrm{kin}}$ .

Relative lengths ${L}_{i,\mathrm{rel}}$ of MSTs ignoring one star each were then calculated, and an arbitrary threshold L_t was set at a value 5% smaller than the 90th percentile ${\langle {L}_{i,\mathrm{rel}}\rangle }_{90 \% }$ of all the relative MST lengths:

$\begin{eqnarray*}{L}_{i,\mathrm{rel}} & = & \sqrt{{\left(\displaystyle \frac{{L}_{i,\mathrm{spa}}}{{L}_{\mathrm{spa}}}\right)}^{2}+{\left(\displaystyle \frac{{L}_{i,\mathrm{kin}}}{{L}_{\mathrm{kin}}}\right)}^{2}},\\ {L}_{{\rm{t}}} & = & 0.95{\langle {L}_{i,\mathrm{rel}}\rangle }_{90 \% }.\end{eqnarray*}$

Any individual MST with relative length ${L}_{i,\mathrm{rel}}\lt {L}_{{\rm{t}}}$ indicates that removing a single star i significantly shortens the spatial and/or kinematic size of the young association, and therefore that star i is an outlier. Such outliers were removed, and an iterative rejection process with a more conservative arbitrary threshold ${L}_{{\rm{t}}}=0.9{\langle {L}_{i,\mathrm{rel}}\rangle }_{90 \% }$ was subsequently applied until no more stars were rejected.
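The leave-one-out MST rejection described above can be sketched as follows. This is an illustration, not the code used in this work: Prim's algorithm is substituted for the Cartwright & Whitworth (2004) implementation, the percentile uses linear interpolation (an assumption), and all function names are hypothetical.

```python
import math


def mst_length(points):
    """Total edge length of the Euclidean minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    if n < 2:
        return 0.0
    best = [math.dist(points[0], p) for p in points]  # shortest link to the tree
    in_tree = [False] * n
    in_tree[0] = True
    total = 0.0
    for _ in range(n - 1):
        j = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        total += best[j]
        in_tree[j] = True
        for i in range(n):
            if not in_tree[i]:
                best[i] = min(best[i], math.dist(points[j], points[i]))
    return total


def percentile(values, q):
    """Percentile with linear interpolation between order statistics (q in [0, 100])."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def relative_lengths(xyz, uvw):
    """L_i,rel for each star i, from MSTs that ignore one star at a time."""
    l_spa, l_kin = mst_length(xyz), mst_length(uvw)
    rel = []
    for i in range(len(xyz)):
        l_i_spa = mst_length(xyz[:i] + xyz[i + 1:])
        l_i_kin = mst_length(uvw[:i] + uvw[i + 1:])
        rel.append(math.hypot(l_i_spa / l_spa, l_i_kin / l_kin))
    return rel


def reject_outliers(xyz, uvw, first_factor=0.95, next_factor=0.90):
    """Return the indices of the stars that survive the iterative rejection."""
    keep = list(range(len(xyz)))
    factor = first_factor
    while len(keep) > 2:
        rel = relative_lengths([xyz[i] for i in keep], [uvw[i] for i in keep])
        threshold = factor * percentile(rel, 90)
        survivors = [k for k, r in zip(keep, rel) if r >= threshold]
        if len(survivors) == len(keep):
            break  # no star was rejected in this pass
        keep = survivors
        factor = next_factor  # later passes use L_t = 0.9 <L_i,rel>_90%
    return keep


# Hypothetical example: ten stars on a ring plus one distant star (index 10).
ring = [(math.cos(2 * math.pi * k / 10), math.sin(2 * math.pi * k / 10), 0.0)
        for k in range(10)]
xyz = ring + [(100.0, 0.0, 0.0)]
uvw = [(0.1 * x, 0.1 * y, 0.1 * z) for x, y, z in xyz]
kept = reject_outliers(xyz, uvw)
```

Rebuilding every MST from scratch costs of order N_k³ operations per pass in this sketch, which remains affordable for associations with at most a few hundred members.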

In a few cases, the bona fide members that survived the MST rejection displayed a clumped distribution of members with some remaining outliers that were visually easy to recognize. They typically survived the MST selection cuts because they are located near at least one other outlier. As a consequence, the 1D projections of the multivariate Gaussian models compared to the location of bona fide members (see Figure 2) were visually inspected to impose additional rejection criteria on the distribution of true members. These criteria are listed in Table 8.

Table 8. Visual Selection Cuts Applied to Moving Groups with Noticeable Outliers

Association	Rejection Criterion
ABDMG	V > −24 km s⁻¹
βPMG	V > −13 km s⁻¹
	W > −4 km s⁻¹
CBER	W < −3 km s⁻¹
ETAC	Z < −37 pc
HYA	V < −22 km s⁻¹
LCC	U < −15 km s⁻¹
PLE	U > 0 km s⁻¹
THA	Z > −20 pc
	U < 12.5 km s⁻¹
	−4 km s⁻¹ < W < 2 km s⁻¹
THOR	U < −25 km s⁻¹
USCO	U > 10 km s⁻¹

**Figure 2.** Multivariate Gaussian model of TWA. The 1, 2 and 3σ projected contours of the multivariate Gaussian model are displayed as orange lines, and black points represent individual bona fide members. The residuals resulting from the difference of a 2D kernel density estimate distribution using Silverman's rule of thumb and the multivariate Gaussian models are displayed in the background of each 2D projection. Green shades indicate an over-density of members compared to the model, and blue shades indicate an under-density. Unidimensional distributions (i.e., histograms) of the bona fide members are displayed as green bars. The thick black line represents a 1D kernel density estimate using Silverman's rule, and the thick orange line represents the projection of the multivariate Gaussian model. In the tridimensional projection figures, a single 1σ contour of the multivariate Gaussian model is displayed. Projections of the bona fide members' positions on the three axis planes are displayed with blue spheres to facilitate viewing. The complete figure set (27 images), one for each young association, is available in the online journal. See Section 3.1 for more details.
Download figure:
Standard image High-resolution image

The bona fide members that were rejected from the model construction are listed in Table 3. An average of three objects (typically fewer than eight) were rejected in each young association. Only PLE and HYA had more rejected members (10 and 15, respectively), but given their large number of members this represents less than 8% of their populations. These objects could still be true members of their respective associations with low-quality or inaccurate kinematic measurements; we therefore do not consider them to be necessarily non-members. The exclusion of such outliers will avoid biasing the spatial and kinematic sizes of the BANYAN Σ models to artificially large values, which would result in larger rates of contamination, as long as they are either non-members or suffer from inaccurate or low-quality kinematic measurements. If some of them are true members of a young association, it is likely that other objects with similar kinematics exist and are not currently known. In such a case, the gradual discovery of moving group members at the edges of the current models with BANYAN Σ, or other methods to identify new moving groups (e.g., see Oh et al. 2017), will make it possible to uncover them. Careful searches using BANYAN Σ with the kinematics-only mode described in Section 3.9 will also make it possible to identify such groups of previously unrecognized members that are outside of the spatial dimensions of the current models, but not outside of their kinematic dimensions. Such investigations are left for future work, and all members that are rejected here will be ignored in what follows.

The weighted averages ${Q}_{i,\mathrm{avg}}$ and covariances C_ij of all spatial-kinematic coordinates {Q_i} were calculated, where the weight of a given star is set to the inverse square of its spatial and kinematic error bars ${\sigma }_{k,\mathrm{spa}}$ and ${\sigma }_{k,\mathrm{kin}}$ relative to the association averages ${\sigma }_{\mathrm{avg},\mathrm{spa}}$ and ${\sigma }_{\mathrm{avg},\mathrm{kin}}$, added in quadrature:

$\begin{eqnarray*}{Q}_{i,\mathrm{avg}} & = & \displaystyle \frac{{\displaystyle \sum }_{k}{w}_{k}{Q}_{i}}{{w}_{\mathrm{tot}}},\\ {C}_{{ij}} & = & \displaystyle \frac{{\displaystyle \sum }_{k}{w}_{k}({Q}_{i}-{Q}_{i,\mathrm{avg}})({Q}_{j}-{Q}_{j,\mathrm{avg}})}{{w}_{\mathrm{tot}}\left({w}_{\mathrm{tot}}^{2}-{\displaystyle \sum }_{k}{w}_{k}^{2}\right){\left({\displaystyle \sum }_{k}{w}_{k}^{2}\right)}^{-1}},\\ {w}_{k} & = & {\left({\left(\displaystyle \frac{{\sigma }_{k,\mathrm{spa}}}{{\sigma }_{\mathrm{avg},\mathrm{spa}}}\right)}^{2}+{\left(\displaystyle \frac{{\sigma }_{k,\mathrm{kin}}}{{\sigma }_{\mathrm{avg},\mathrm{kin}}}\right)}^{2}\right)}^{-1},\\ {w}_{\mathrm{tot}} & = & {\displaystyle \sum }_{k}{w}_{k}.\end{eqnarray*}$

The values of weights were set to a maximum of w_k < 50, corresponding to measurement errors 10 times more precise than the association average, to avoid the possibility of a very small number of precise XYZUVW measurements bearing too much weight in the kinematic models.
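As an illustration, the weighting scheme above can be sketched in a few lines of Python. This is a minimal sketch and not the BANYAN Σ code: the function names are hypothetical, the association averages of the error bars are assumed to be simple arithmetic means, and the covariance normalization is transcribed term by term from the equation as printed above.

```python
def capped_weights(sig_spa, sig_kin, w_max=50.0):
    """Weights w_k from per-star spatial and kinematic error bars.

    Assumption: the association averages are plain arithmetic means.
    """
    avg_spa = sum(sig_spa) / len(sig_spa)
    avg_kin = sum(sig_kin) / len(sig_kin)
    weights = []
    for s, k in zip(sig_spa, sig_kin):
        w = 1.0 / ((s / avg_spa) ** 2 + (k / avg_kin) ** 2)
        weights.append(min(w, w_max))  # cap so very precise stars do not dominate
    return weights


def weighted_model(coords, weights):
    """Weighted center Q_avg and covariance C for a list of per-star coordinates.

    The normalization w_tot (w_tot^2 - sum w^2) / (sum w^2) follows the
    equation as printed in the text.
    """
    n_dim = len(coords[0])
    w_tot = sum(weights)
    w_sq = sum(w * w for w in weights)
    q_avg = [sum(w * q[i] for w, q in zip(weights, coords)) / w_tot
             for i in range(n_dim)]
    norm = w_tot * (w_tot ** 2 - w_sq) / w_sq
    cov = [[sum(w * (q[i] - q_avg[i]) * (q[j] - q_avg[j])
                for w, q in zip(weights, coords)) / norm
            for j in range(n_dim)]
           for i in range(n_dim)]
    return q_avg, cov
```

In practice `coords` would hold the six XYZUVW values of each bona fide member; the sketch works for any number of dimensions.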

The covariance matrix $\bar{\bar{{\rm{\Sigma }}}}$ and center vector $\bar{\tau }$ of an association were then built from its components ${Q}_{i,\mathrm{avg}}$ and C_ij, and the covariance matrix was regularized¹⁷ to avoid numerical problems. This was done through a singular value decomposition of the covariance matrix:

$\begin{eqnarray*}&&\bar{\bar{{\rm{\Sigma }}}}={\bar{\bar{{\rm{\Sigma }}}}}_{U}{\bar{\bar{{\rm{\Sigma }}}}}_{\mathrm{sv}}{\bar{\bar{{\rm{\Sigma }}}}}_{V}^{{\rm{T}}},\end{eqnarray*}$

where ${\bar{\bar{{\rm{\Sigma }}}}}_{\mathrm{sv}}$ is a diagonal matrix containing the singular values. If the determinants $| {\bar{\bar{{\rm{\Sigma }}}}}_{U}|$ or $| {\bar{\bar{{\rm{\Sigma }}}}}_{V}|$ were found to be negative, random noise with a standard deviation equal to half the error bars was added to the XYZUVW coordinates of all members until $\bar{\bar{{\rm{\Sigma }}}}$ was found to be nonsingular. If an association contained fewer than 30 members, the three spatial and three kinematic singular values were forced to a minimum of 1 pc and 0.2 km s⁻¹, respectively.

Because the addition of noise in the regularization process can break the symmetry of the covariance matrix, the non-diagonal elements of the covariance matrix are forced to be symmetric by setting the values of both Σ_ij and Σ_ji to the average $({{\rm{\Sigma }}}_{{ij}}+{{\rm{\Sigma }}}_{{ji}})/2$ . In a final step, the non-diagonal elements Σ_ij were forced to values within the range $\pm \sqrt{{{\rm{\Sigma }}}_{{ii}}{{\rm{\Sigma }}}_{{jj}}}(1-{10}^{-5})$ to respect the properties of a covariance matrix. These steps did not cause any of the covariance matrices to become singular again.
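The symmetrization and clamping steps described above can be sketched as follows. This is a minimal illustration with a hypothetical function name; it operates on a covariance matrix stored as nested lists and leaves the diagonal untouched.

```python
import math


def symmetrize_and_clamp(cov, eps=1e-5):
    """Average Sigma_ij and Sigma_ji, then clamp each off-diagonal element to
    +/- sqrt(Sigma_ii * Sigma_jj) * (1 - eps)."""
    n = len(cov)
    out = [row[:] for row in cov]  # copy so the input is not modified
    for i in range(n):
        for j in range(i + 1, n):
            avg = 0.5 * (cov[i][j] + cov[j][i])
            limit = math.sqrt(cov[i][i] * cov[j][j]) * (1.0 - eps)
            avg = max(-limit, min(limit, avg))
            out[i][j] = avg
            out[j][i] = avg
    return out
```

The clamp keeps every implied correlation coefficient strictly inside (−1, 1), which is the property of a covariance matrix that the final step is meant to preserve.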

The regularization of the covariance matrix is necessary to ensure that the marginalization integrals solved in Section 3.3 converge. Ill-defined covariance matrices with a negative determinant would cause the analytical solution to diverge.

The resulting moving group models are displayed in Figure 2, and their parameters are listed in Tables 1 and 9. The off-diagonal elements of the covariance matrices are provided in a FITS file with the BANYAN Σ algorithm (Gagné 2018a, 2018b). A kernel density estimate distribution was built for the 6D distribution of bona fide members using Silverman's rule of thumb, i.e., each data point is represented with a zero-covariance 6D multivariate Gaussian where the diagonal elements of the covariance matrix are given by:

$\begin{eqnarray}{{\rm{\Sigma }}}_{{ii}} & = & {(2{N}_{k})}^{-1/5}{\sigma }_{i}^{2},\\ {{\rm{\Sigma }}}_{{ij}} & = & 0,i\ne j,\end{eqnarray} \tag{ 21 }$

where σ_i is the standard deviation of the dimension i of the members' positions, and N_k is the total number of members. The 1D projections of the kernel density estimate distribution are shown in the panels of Figure 2 that display histograms of the members' positions, and the residual difference between the multivariate Gaussian models and the 2D projections of the kernel density estimate distributions are shown in blue- and green-shaded backgrounds with the 2D distributions of members. There are several cases (such as TWA) where a number of projections show over- and under-densities of members by up to ≈40% of the multivariate Gaussian model, but we recommend against using multivariate Gaussian mixture models that would more correctly reproduce the distribution of known members until a large number of members are known (e.g., with the release of Gaia-DR2). Modeling the currently significantly incomplete distributions of association members with more complex models would negatively affect the ability of BANYAN Σ to recover the missing members that are located between the clumps of currently known members (i.e., in the blue-shaded regions in Figure 2).
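For reference, the kernel variances of Equation (21) can be computed as in the short sketch below. The function name is hypothetical, and the use of the sample (N − 1) variance for σ_i² is an assumption, since the text does not specify the estimator.

```python
import statistics


def silverman_kernel_variances(members):
    """Diagonal kernel variances Sigma_ii = (2 N_k)^(-1/5) * sigma_i^2.

    members: list of per-member coordinate lists.
    Assumption: sigma_i^2 is the sample variance along dimension i.
    """
    n_k = len(members)
    n_dim = len(members[0])
    factor = (2.0 * n_k) ** (-1.0 / 5.0)
    return [factor * statistics.variance([m[i] for m in members])
            for i in range(n_dim)]
```

Each member is then represented by a 6D Gaussian with these variances on the diagonal and zeros elsewhere.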

Table 9. Parameters for the Central Location and Variances of the Multivariate Gaussian Models of Young Associations

Asso.	$\langle X\rangle$ (pc)	$\langle Y\rangle$ (pc)	$\langle Z\rangle$ (pc)	$\langle U\rangle$ (km s⁻¹)	$\langle V\rangle$ (km s⁻¹)	$\langle W\rangle$ (km s⁻¹)	${{\rm{\Sigma }}}_{00}^{1/2}$ (pc)	${{\rm{\Sigma }}}_{11}^{1/2}$ (pc)	${{\rm{\Sigma }}}_{22}^{1/2}$ (pc)	${{\rm{\Sigma }}}_{33}^{1/2}$ (km s⁻¹)	${{\rm{\Sigma }}}_{44}^{1/2}$ (km s⁻¹)	${{\rm{\Sigma }}}_{55}^{1/2}$ (km s⁻¹)
118TAU	−102.3	−4.8	−9.9	−12.8	−19.1	−9.2	12.7	2.4	1.8	2.1	2.8	1.6
ABDMG	−6.0	−7.2	−8.8	−7.2	−27.6	−14.2	21.4	20.3	16.3	1.4	1.0	1.8
βPMG	4.1	−6.7	−15.7	−10.9	−16.0	−9.0	29.3	14.0	9.0	2.2	1.2	1.0
CAR	6.7	−50.5	−15.5	−10.66	−21.92	−5.48	10.0	18.1	12.6	0.67	1.02	1.01
CARN	0.7	−28.1	−4.3	−25.3	−18.1	−2.3	7.8	20.8	17.3	3.2	1.9	2.0
CBER	−6.0	−5.1	84.9	−2.30	−5.51	−0.61	3.3	3.3	4.5	0.53	0.44	0.71
COL	−25.9	−25.9	−21.4	−11.90	−21.28	−5.66	12.1	23.0	17.8	1.04	1.29	0.75
CRA	132.45	−0.21	−42.43	−3.7	−15.7	−8.8	3.71	0.75	2.04	1.3	2.2	2.2
EPSC	49.9	−84.8	−25.6	−9.9	−19.3	−9.7	2.5	3.6	4.0	1.6	2.2	2.0
ETAC	33.65	−81.36	−34.81	−10.0	−22.3	−11.7	0.65	0.98	0.71	1.6	2.8	1.8
HYA	−38.5	0.8	−15.8	−42.27	−18.79	−1.47	7.4	4.4	2.9	2.01	0.94	1.10
IC2391	1.9	−148.1	−18.0	−23.04	−14.89	−5.48	1.3	6.4	1.4	1.10	3.40	0.78
IC2602	47.4	−137.6	−12.6	−8.22	−20.60	−0.58	1.5	5.4	1.1	1.18	2.61	0.65
LCC	54.3	−94.2	5.8	−7.8	−21.5	−6.2	11.9	12.4	13.7	2.7	3.8	1.8
OCT	4.0	−96.9	−59.7	−13.7	−3.3	−10.1	78.3	25.8	8.8	2.4	1.3	1.4
PL8	10.6	−124.5	−13.9	−11.01	−22.89	−3.59	7.0	11.6	4.5	1.15	1.96	0.74
PLE	−118.9	28.5	−54.4	−6.7	−28.0	−14.0	7.7	3.5	4.2	1.7	1.8	1.2
ROPH	124.79	−15.23	37.60	−5.9	−13.5	−7.9	1.33	0.51	0.66	1.3	4.7	4.3
TAU	−116.3	6.7	−35.9	−14.3	−9.3	−8.8	11.4	10.8	10.1	3.1	4.5	3.4
THA	5.4	−20.1	−36.1	−9.79	−20.94	−0.99	19.4	12.4	3.8	0.87	0.79	0.72
THOR	−88.4	−25.7	−23.9	−12.8	−18.8	−9.0	4.1	6.9	5.1	2.2	2.2	2.0
TWA	14.4	−47.7	22.7	−11.6	−17.9	−5.6	12.2	9.7	3.9	1.8	1.8	1.6
UCL	107.5	−60.9	26.5	−4.7	−19.7	−5.2	21.0	19.6	13.5	3.8	3.0	1.7
UCRA	142.1	−1.2	−39.2	−3.7	−17.1	−8.0	7.3	2.4	5.9	3.0	1.8	1.2
UMA	−7.5	9.9	21.9	14.8	1.8	−10.2	3.1	1.5	1.1	1.0	1.2	2.6
USCO	121.2	−17.0	48.9	−4.9	−14.2	−6.5	17.0	8.2	8.9	3.7	3.2	2.3
XFOR	−27.1	−46.3	−84.2	−12.54	−22.24	−6.26	4.7	3.8	4.4	0.96	1.41	2.21

Note. The average Galactic positions $\langle X\rangle$, $\langle Y\rangle$ and $\langle Z\rangle$ and space velocities $\langle U\rangle$, $\langle V\rangle$ and $\langle W\rangle$ correspond to the components of the $\bar{\tau }$ vector defining the center of the multivariate Gaussian model. Parameters Σ₀₀ through Σ₅₅ represent the diagonal elements of the covariance matrix, and correspond to the dispersion of Galactic positions and space velocities along the principal axes of the multivariate Gaussian, which are not necessarily aligned with the Galactic coordinates axes. See Section 5 for more details. The covariances of the multivariate Gaussian models are given in a FITS file format with the BANYAN Σ codes (Gagné 2018a, 2018b).

The distance and radial velocity distributions of young moving group models were built by drawing 10⁵ synthetic objects from their spatial multivariate Gaussian model. The average distance or radial velocity was taken as the peak value of the resulting distribution and asymmetric characteristic widths covering half of 68% of the area under the curve on each side were measured. The resulting distance and radial velocity distributions as a function of the association ages are listed in Table 1 and displayed in Figures 3(a) and (b). These figures illustrate the range in distances and radial velocities where new moving group members of a given age can likely be discovered using BANYAN Σ.
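The Monte Carlo procedure described above can be sketched as follows. This is an illustrative simplification rather than the procedure used to build Table 1: the function names are hypothetical, the model is treated as axis-aligned (the off-diagonal covariances are ignored), the peak is located with a simple histogram, and the heliocentric radial velocity is taken as the projection of UVW on the line of sight.

```python
import math
import random


def distance_rv_samples(center, widths, n_draws=100000, seed=0):
    """Draw synthetic XYZUVW objects and return their distances (pc) and
    radial velocities (km/s). Assumption: diagonal covariance matrix."""
    rng = random.Random(seed)
    dist, rv = [], []
    for _ in range(n_draws):
        x, y, z, u, v, w = (rng.gauss(c, s) for c, s in zip(center, widths))
        d = math.sqrt(x * x + y * y + z * z)
        dist.append(d)
        rv.append((x * u + y * v + z * w) / d)  # velocity along the line of sight
    return dist, rv


def peak_and_widths(samples, n_bins=100):
    """Histogram peak and the lower/upper widths that each enclose 34% of the samples."""
    lo, hi = min(samples), max(samples)
    step = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in samples:
        counts[min(int((s - lo) / step), n_bins - 1)] += 1
    peak = lo + (counts.index(max(counts)) + 0.5) * step
    ordered = sorted(samples)
    n = len(ordered)
    frac_below = sum(1 for s in ordered if s < peak) / n
    i_lo = int(max(frac_below - 0.34, 0.0) * (n - 1))
    i_hi = int(min(frac_below + 0.34, 1.0) * (n - 1))
    return peak, peak - ordered[i_lo], ordered[i_hi] - peak
```

Calling `distance_rv_samples` with the center and dispersions of one row of Table 9, then `peak_and_widths` on each returned list, gives a rough analog of the distance and radial velocity ranges discussed here.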

**Figure 3.** Distance, radial velocity, spatial size and kinematic scatter of BANYAN Σ association models as a function of age. Associations of the solar neighborhood provide individual epochs that cover a large period of ages, relevant to disk evolution, planetary formation, and brown dwarfs' atmospheric cooling. Associations in our sample seem to display an increasing spatial size as a function of age, except for the denser open clusters. TAU is also an exception because its model includes several sub-groups. See Section 5 for more details.

In Figures 3(c) and (d), the characteristic spatial and kinematic scales S_spa and S_kin are displayed for different associations as a function of age, with:

$\begin{eqnarray*}{S}_{\mathrm{spa}} & = & \sqrt{| {{\boldsymbol{\Sigma }}}_{\mathrm{spa}}{| }^{1/3}},\\ {S}_{\mathrm{kin}} & = & \sqrt{| {{\boldsymbol{\Sigma }}}_{\mathrm{kin}}{| }^{1/3}},\end{eqnarray*}$

where ${{\boldsymbol{\Sigma }}}_{\mathrm{spa}}$ and ${{\boldsymbol{\Sigma }}}_{\mathrm{kin}}$ are 3 × 3 matrices that contain only the purely spatial or kinematic terms of the covariance matrix, respectively. The values of S_spa and S_kin are listed for each young association in Table 1.
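As a worked example of these definitions, the sketch below extracts the two 3 × 3 blocks from a 6 × 6 covariance matrix and evaluates both scales. The function names are hypothetical. For a diagonal block, the result reduces to the geometric mean of the three standard deviations.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def characteristic_scale(block):
    """S = sqrt(|Sigma|^(1/3)) for a 3x3 covariance block."""
    return math.sqrt(det3(block) ** (1.0 / 3.0))


def spatial_kinematic_scales(cov6):
    """Return (S_spa, S_kin) from a 6x6 XYZUVW covariance matrix."""
    spa = [row[:3] for row in cov6[:3]]   # purely spatial terms
    kin = [row[3:] for row in cov6[3:]]   # purely kinematic terms
    return characteristic_scale(spa), characteristic_scale(kin)
```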

Figure 3(c) illustrates how clusters are spatially much smaller than other types of associations, which become more dispersed as they age. The velocity dispersion of associations included in BANYAN Σ, as displayed in Figure 3(d), does not show a clear correlation with age. TAU is a clear outlier in both figures because a single multivariate Gaussian model is used to represent all of the TAU sub-groups.

The distribution of associations in the Galactic plane is displayed in Figures 4 and 5. The spatial location of each association is defined as the contour that encompasses 68% of the projected multivariate Gaussian model, with an arbitrary minimum minor axis set at 4 pc for display. These figures illustrate how several associations are spatially close to each other, and how some of the nearest ones (βPMG, ABDMG) encompass the Sun. OCT is spatially the largest association because its members are distributed in two distinct clumps, even though they share the same kinematics. Here we leave OCT as a single group because this may allow for the identification of new OCT members located spatially between the two clumps of known members.

**Figure 4.** Distribution in Galactic coordinates X and Y of all moving group models constructed in Section 3.1. The models included in BANYAN Σ cover all known associations and star-forming regions within 150 pc. This figure is an update of Figure 8 in Rice et al. (2011), although it is limited to 150 pc instead of 200 pc. See Section 5 for more details.

**Figure 5.** Distribution in Galactic coordinates X and Z of all moving group models constructed in Section 3.1. The color and linestyle coding is the same as that of Figure 4. See Section 5 for more details.

6. A Model of Field Stars in the Solar Neighborhood

This section describes the construction of a kinematic model for the field hypothesis. It is based on the Besançon model (Robin et al. 1996, 2003, 2012, 2014, 2017)¹⁸ of the Galactic disk in the solar neighborhood (with a very small contribution from the Galactic halo), and uses the multivariate Gaussian formalism described in Section 3.1 for it to be compatible with the solution of the marginalization integrals developed in Section 3.3.

The Besançon Galactic model version used here follows the scheme described by Czekaj et al. (2014) for the thin disk population, which is based on their Model B (see their Table 5). In summary, thin disk stars are generated from a 3-slope initial mass function and a decreasing star formation rate, and follow the evolutionary tracks of Bertelli et al. (1994, 2008, 2009) for masses larger than 0.7M_⊙, and those of Chabrier & Baraffe (1997) for lower masses. Companions in binary systems are generated with a probability function that depends on the spectral type of the primary, and follow empirical mass-ratio and semimajor axis distributions, as described by Arenou (2011). The thick disk and halo populations are simulated with the best-fitting parameters obtained in the analysis of Robin et al. (2014) based on SDSS (Alam et al. 2015) and 2MASS data, and the isochrones of Bergbusch & Vandenberg (1992).

There are two complications that prevent a correct modeling of the field star kinematics with a simple multivariate Gaussian model: (1) the distribution in Z is similar to a hyperbolic secant function, which has wider wings than a Gaussian distribution; and (2) the distributions in X and Y are approximately uniform in the solar neighborhood.

The first problem can be addressed by modeling the field hypothesis with a mixture of N multivariate Gaussian distributions:

$\begin{eqnarray*}{{ \mathcal P }}_{\mathrm{field}} & = & \displaystyle \sum _{j=1}^{N}{c}_{j}\,{{ \mathcal P }}_{j,\mathrm{field}},\\ {{ \mathcal P }}_{j,\mathrm{field}} & = & \displaystyle \frac{1}{\sqrt{{(2\pi )}^{6}| {\bar{\bar{{\rm{\Sigma }}}}}_{j}| }}\exp \left(-\displaystyle \frac{1}{2}{(\bar{Q}-{\bar{\tau }}_{j})}^{T}{\bar{\bar{{\rm{\Sigma }}}}}_{j}^{-1}(\bar{Q}-{\bar{\tau }}_{j})\right),\\ \displaystyle \sum _{j=1}^{N}{c}_{j} & = & 1,\end{eqnarray*}$

which yields the solution described in Equation (12) for the probability ${ \mathcal P }(\{{O}_{i}\}| {H}_{j,\mathrm{field}})$ associated with field component j. The resulting field probability will then be:

$\begin{eqnarray*}&&{ \mathcal P }(\{{O}_{i}\}| {H}_{\mathrm{field}})=\displaystyle \sum _{j=1}^{N}{c}_{j}{ \mathcal P }(\{{O}_{i}\}| {H}_{j,\mathrm{field}}).\end{eqnarray*}$
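The mixture density itself is straightforward to evaluate. The sketch below is a minimal illustration with hypothetical function names; it assumes diagonal covariance matrices for every component, which is a simplification of the general form written above, and it evaluates the density at a fully known point rather than the marginalized probability of Equation (12).

```python
import math


def gaussian_diag_pdf(q, tau, var):
    """Density of a Gaussian with center tau and diagonal variances var at q."""
    log_p = 0.0
    for qi, ti, vi in zip(q, tau, var):
        log_p -= 0.5 * ((qi - ti) ** 2 / vi + math.log(2.0 * math.pi * vi))
    return math.exp(log_p)


def field_density(q, components):
    """Mixture density sum_j c_j P_j at q.

    components: list of (c_j, tau_j, var_j); the c_j must sum to one.
    """
    total = sum(c for c, _, _ in components)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("mixture coefficients must sum to 1")
    return sum(c * gaussian_diag_pdf(q, tau, var) for c, tau, var in components)
```

Summing the log-density over dimensions before exponentiating avoids underflow when the point lies far from a component center in several dimensions at once.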

The second problem of the approximately uniform X and Y distributions can be mitigated by artificially inflating the two diagonal elements of all covariance matrices ${\bar{\bar{{\rm{\Sigma }}}}}_{j}$ corresponding to the X and Y dimensions to widths much larger than the typical distances that will be involved in using BANYAN Σ. The density of the field model will, however, need to be re-adjusted so that this inflation does not alter the stellar density in the solar neighborhood.

The very small covariance between all XYZUVW coordinates of field stars compared to their variances (the Pearson correlation coefficient of all dimensions is smaller than 0.1) makes the problem of fitting a mixture of multivariate Gaussians significantly easier. The covariance matrices ${\bar{\bar{{\rm{\Sigma }}}}}_{j}$ can be assumed diagonal and the fitting can be done simultaneously in four one-dimensional spaces Z, U, V and W instead of a single four-dimensional space. The X and Y coordinates are ignored in a first step, as they will be approximated as uniform by using large Gaussian widths.

Multivariate Gaussian mixtures with N = 1 to 10 components were fitted to the Z, U, V, and W distributions of field stars using a Levenberg–Marquardt least-squares fit. The best-fitting models as well as the individual components of the N = 10 model are shown in Figure 6. Models with N > 6 provide a good visual fit to all Z, U, V, and W components.

**Figure 6.** Unidimensional projections of the best-fitting ten-component multivariate Gaussian mixture models for the field stars (red dashed lines) compared to the distribution of stars in the Besançon Galactic model (black solid lines). The individual components of the best-fitting models are displayed as green lines, and best-fitting models that include fewer than seven Gaussian components are displayed as blue dashed lines. Space velocity distributions are slightly asymmetric and the distribution in the Z component of the Galactic position has much wider wings than a Gaussian model. The distributions in X and Y are approximated as uniform in the solar neighborhood. See Section 6 for more details.

In Figure 7, the reduced χ² as a function of the number of mixture components N is displayed for each dimension, and for the global fit across ZUVW. This figure shows that the goodness-of-fit does not improve significantly at N > 7 for the kinematic dimensions UVW, but Z keeps improving up to N = 9–10. The N = 10 components model was adopted, as it represents a good balance between accuracy and usability.

This simpler approach was preferred over one based on the Bayesian information criterion, as the very large number of field stars would have allowed for an arbitrarily large number of mixture components, which would make BANYAN Σ impractical to use while causing very little difference in the calculated probabilities. The simplicity of the least-squares fitting also allowed the identification of a good general solution without needing to compute resource-intensive likelihood functions based on a very large number of field stars.

The X and Y components of the field model were arbitrarily set to a large characteristic width of 1500 pc. To ensure that this did not affect the density of stars in the solar neighborhood, a Monte Carlo simulation was used to draw field objects until the ratio of stars within 300 pc to the total number of stars could be measured with a precision better than 1%, assuming Poisson error bars. This required a total of 10⁹ synthetic stars to be drawn and yielded a ratio of 1.69 × 10⁻⁵. The total number of objects within 300 pc in the Besançon model (7.15 × 10⁶) was then divided by this ratio to re-normalize the field model. This ensures that the multivariate Gaussian mixture model has the same density of stars as the Besançon model within 300 pc. The adopted field model parameters are provided as a FITS file containing all input data used by the BANYAN Σ IDL and Python codes (Gagné 2018a, 2018b).
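The quoted precision can be checked with a few lines of arithmetic, using only the two numbers given above and assuming that the Poisson error applies to the count of synthetic stars that fall within 300 pc.

```python
import math

n_drawn = 10 ** 9   # synthetic stars drawn (from the text)
ratio = 1.69e-5     # fraction found within 300 pc (from the text)

n_inside = n_drawn * ratio                  # about 16,900 stars within 300 pc
rel_precision = 1.0 / math.sqrt(n_inside)   # Poisson relative uncertainty on that count
```

The relative uncertainty comes out just under 1%, consistent with the stated stopping criterion.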

6.1. The Spatial Size of Proper Motion and Galactic Latitude-limited Stellar Samples

One common way to eliminate distant stars from a sample is to impose a lower limit on the total proper motion and/or Galactic latitude. The model of the nearby Galactic disk developed in this work was used to determine the efficiency of such proper motion and Galactic latitude cuts at selecting nearby stars.

A set of 200 thresholds on the magnitude of proper motion and five thresholds on the Galactic latitude were selected uniformly in the ranges 5–300 mas yr⁻¹ and 0°–40°, respectively. For each combination of thresholds, the ZUVW coordinates of 10⁷ stars were drawn randomly from the multivariate Gaussian mixture model of the Galactic disk. Because the X and Y coordinates are approximated as uniform in the solar neighborhood, they were drawn from a uniform random distribution bounded within a distance that produces a good sampling of the distance distribution of stars selected by the proper motion threshold. Bounds of ±10,000 pc/μ (mas yr⁻¹) on both X and Y were found to be adequate. All stars with proper motions and Galactic latitudes larger than the specific set of thresholds were selected, and the smallest distance that encompasses 90% of the sample was calculated.
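The selection experiment can be sketched as below. This is a toy version and not the calculation behind Figure 8: the function name is hypothetical, the field is represented by a single zero-mean Gaussian in Z, U, V, and W with placeholder widths instead of the ten-component mixture, and far fewer stars are drawn. The proper motion follows from the tangential velocity and distance through the usual factor of 4.74047 km s⁻¹ per (arcsec yr⁻¹ pc).

```python
import math
import random

KAPPA = 4.74047  # km/s corresponding to 1 arcsec/yr at 1 pc


def distance_enclosing(mu_min, b_min_deg, frac=0.9, n_draws=50000, seed=1,
                       sigma_z=300.0, sigma_uvw=(35.0, 25.0, 18.0)):
    """Smallest distance (pc) enclosing `frac` of the stars that pass both cuts.

    sigma_z (pc) and sigma_uvw (km/s) are hypothetical placeholder widths.
    """
    rng = random.Random(seed)
    bound = 10000.0 / mu_min  # pc, half-width of the uniform X and Y draws
    kept = []
    for _ in range(n_draws):
        x = rng.uniform(-bound, bound)
        y = rng.uniform(-bound, bound)
        z = rng.gauss(0.0, sigma_z)
        u, v, w = (rng.gauss(0.0, s) for s in sigma_uvw)
        d = math.sqrt(x * x + y * y + z * z)
        rv = (x * u + y * v + z * w) / d
        v_tan = math.sqrt(max(u * u + v * v + w * w - rv * rv, 0.0))
        mu = 1000.0 * v_tan / (KAPPA * d)                   # mas/yr
        b = math.degrees(math.atan2(z, math.hypot(x, y)))   # Galactic latitude
        if mu > mu_min and abs(b) > b_min_deg:
            kept.append(d)
    if not kept:
        raise ValueError("no star passed the cuts; increase n_draws")
    kept.sort()
    return kept[int(frac * (len(kept) - 1))]
```

Looping this function over a grid of `mu_min` and `b_min_deg` values gives a curve family analogous to the one described here, within the limits of the placeholder velocity and height distributions.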

The resulting distances encompassing 90% of a sample are displayed in Figure 8 as a function of proper motion and Galactic latitude selection cuts. This figure demonstrates that aggressive cuts on proper motion (μ > 100 mas yr⁻¹) must be used to limit a sample to distances ≲500 pc. The threshold on proper motion and Galactic latitude of the BANYAN All-Sky Survey for members of young associations (Gagné et al. 2015c) only limited their sample to distances of <800 pc, much larger than the 200 pc distance limit that was used to build the model of field stars in BANYAN II (Gagné et al. 2014). This problem was mitigated by the fact that the survey focused on substellar objects well detected in 2MASS, which limited the sample to distances ≲200 pc for >M5-type objects. However, this highlights that a model of field stars that remains valid at much larger distances, such as the one developed in this section, is necessary to limit the rate of false positives in all-sky searches for stellar members of young associations.

**Figure 8.** Largest distance encompassing 90% of randomly selected field stars, as a function of lower cuts on total proper motion μ and absolute Galactic latitude $| b|$. The criteria used in the BASS survey for young brown dwarfs (Gagné et al. 2015b, 2015c) and the search for red L dwarfs of Schneider et al. (2017) are displayed with the star and triangle symbols, respectively. This figure demonstrates that field interlopers with distances up to ≈750 pc and ≈500 pc are likely contaminating the two respective input samples. BANYAN II has a limited capability of capturing contaminants at distances further than 200 pc, as more distant stars were not included in its model of the Galactic field. See Section 6 for more details.

7. The Choice of Bayesian Priors

The contamination and recovery rates in a sample of candidate members selected with BANYAN Σ will depend on the young association in which a given star is classified as a likely member. The more distant associations will provide a larger recovery rate of true association members at a fixed rate of contamination, because the members are distributed on a smaller region of the sky. As a consequence, using the same Bayesian probability threshold for the candidate members of all young associations will result in samples of vastly different sizes, completeness, and contamination rates.

It is possible to adjust the Bayesian priors of the young associations in a way that equalizes the contamination or recovery rates of all young associations at an arbitrary Bayesian probability threshold. These priors will be used in the Bayesian membership probability determination (see Equation (1)). The threshold P = 90% was selected here so that BANYAN Σ approaches recovery rates R_k of 50% (μ only), 68% (μ + ν), 82% (μ + ϖ), or 90% (μ + ν + ϖ) in terms of the fraction of recovered bona fide members. This decision is arbitrary, but will allow translating the BANYAN Σ probabilities to survey completeness fractions more simply. This will therefore make BANYAN Σ easier to use in searches for new members across all 27 associations.

This was done for each young association H_k with a Monte Carlo method. The XYZUVW coordinates of 10⁷ synthetic stars were drawn from the kinematic model of the association. All coordinates were transformed to sky position, proper motion, radial velocity, and distance. Gaussian random error bars of 10 mas yr⁻¹ were added to each component of the proper motion, and radial velocity and distance were assumed to be missing. Simplified BANYAN Σ probabilities ${P}^{{\prime} }({H}_{k}| \{{O}_{i}\})$ were determined for all synthetic objects by ignoring all other groups ${H}_{j},j\ne k$ :

$\begin{eqnarray*}&&{P}^{{\prime} }({H}_{k}| \{{O}_{i}\})=\displaystyle \frac{{P}^{{\prime} }({H}_{k}){P}^{{\prime} }(\{{O}_{i}\}| {H}_{k})}{{P}^{{\prime} }({H}_{\mathrm{field}}){P}^{{\prime} }(\{{O}_{i}\}| {H}_{\mathrm{field}})},\end{eqnarray*}$

where all initial priors P'(H_k) and P'(H_field) were set to unity.

A set of 10⁵ probability thresholds P_t was defined to explore how they affected the number of true positives (N_TP; association members with ${P}^{{\prime} }({H}_{k}| \{{O}_{i}\})\geqslant {P}_{t}$ ) and false negatives (N_FN; association members with ${P}^{{\prime} }({H}_{k}| \{{O}_{i}\})\lt {P}_{t}$ ).

The probability thresholds P_t were distributed along:

$\begin{eqnarray*}{P}_{t} & = & \displaystyle \frac{1}{2}\left(\displaystyle \frac{\arctan (\omega x)}{\arctan (\omega )}+1\right),\\ \omega & = & 95,\end{eqnarray*}$

where x is an array of 10⁵ uniformly distributed values in the range [−1, 1]. This produces an array of thresholds where P_t ≈ 0 and P_t ≈ 1 are especially well sampled as ω takes larger values.
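The threshold grid is simple to generate. The sketch below uses a hypothetical function name, with the grid size and ω as parameters.

```python
import math


def probability_thresholds(n=100000, omega=95.0):
    """Thresholds P_t = (arctan(omega x) / arctan(omega) + 1) / 2 for n values
    of x uniformly spaced in [-1, 1]."""
    norm = math.atan(omega)
    xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return [0.5 * (math.atan(omega * x) / norm + 1.0) for x in xs]
```

With ω = 95, only a few percent of the thresholds fall between 0.1 and 0.9; the rest are concentrated near 0 and 1, where the recovery rate changes most rapidly with the threshold.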

Recovery rates R_k (also called "true positive rates" or TPRs) are determined as a function of threshold P_t, with:

$\begin{eqnarray*}&&{R}_{k}=\displaystyle \frac{{N}_{\mathrm{TP}}}{{N}_{\mathrm{TP}}+{N}_{\mathrm{FN}}},\end{eqnarray*}$

where N_TP is the number of synthetic stars originating from the model of association H_k with P_k ≥ P_t, and N_FN is the number with P_k < P_t.

For each young association, the probability threshold ${P}_{t,\mathrm{crit}}$ that generates the desired recovery rate (50%–90% depending on observables) was then selected, and a multiplicative factor α_k that ensures ${P}_{t,\mathrm{crit}}=90 \%$ when included in the Bayesian prior was then determined:

$\begin{eqnarray*}&&{\alpha }_{k}=\displaystyle \frac{0.9(1-{P}_{t,\mathrm{crit}})}{(1-0.9){P}_{t,\mathrm{crit}}}.\end{eqnarray*}$
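The chain from synthetic probabilities to α_k can be sketched as follows. The function names are hypothetical. Two shortcuts are assumptions of this sketch: the critical threshold is read directly from the sorted probabilities instead of being scanned over the threshold grid, and `rescaled_probability` assumes a two-hypothesis probability normalized as P_k/(P_k + P_field), so that α_k multiplies the odds of membership.

```python
import math


def recovery_rate(probs, p_t):
    """R_k = N_TP / (N_TP + N_FN) for synthetic members with probabilities probs."""
    n_tp = sum(1 for p in probs if p >= p_t)
    n_fn = len(probs) - n_tp
    return n_tp / (n_tp + n_fn)


def critical_threshold(probs, target_recovery):
    """Largest threshold at which the recovery rate still reaches the target."""
    ordered = sorted(probs, reverse=True)
    k = math.ceil(target_recovery * len(ordered))
    return ordered[k - 1]


def prior_factor(p_crit, p_goal=0.9):
    """alpha_k that moves the critical threshold p_crit to p_goal."""
    return p_goal * (1.0 - p_crit) / ((1.0 - p_goal) * p_crit)


def rescaled_probability(p, alpha):
    """Probability after multiplying the membership odds p / (1 - p) by alpha."""
    return alpha * p / (alpha * p + (1.0 - p))
```

For example, a critical threshold of 0.6 gives α_k = 6, and rescaling 0.6 with that factor returns 0.9.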

To avoid biasing the young association probabilities of ambiguous candidate members, an average of each young association factor α_k, weighted by the individual membership probabilities $P({H}_{k}| \{{O}_{i}\})$ , is used to determine the field prior:

$\begin{eqnarray*}P({H}_{k}) & = & 1,\\ P({H}_{\mathrm{field}}) & = & {\left(\displaystyle \frac{{\displaystyle \sum }_{k}^{{\prime} }{\alpha }_{k}P({H}_{k}| \{{O}_{i}\})}{{\displaystyle \sum }_{k}^{{\prime} }P({H}_{k}| \{{O}_{i}\})}\right)}^{-1},\end{eqnarray*}$

where the sum ${\sum }_{k}^{{\prime} }$ excludes k = field. The quantities α_k are fixed for all candidate stars, but P(H_field) has to be computed for each star, since it depends on $P({H}_{k}| \{{O}_{i}\})$ .
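The per-star field prior can be written compactly, as in the sketch below. The function name and the dictionary layout are hypothetical, and the α_k values in the usage note are placeholders and not the values of Table 1.

```python
def field_prior(alphas, memberships):
    """P(H_field) as the inverse of the membership-weighted average of alpha_k.

    alphas: {association: alpha_k}, excluding the field.
    memberships: {association: P(H_k | {O_i})} for one star, excluding the field.
    """
    num = sum(alphas[k] * memberships[k] for k in alphas)
    den = sum(memberships[k] for k in alphas)
    if num == 0.0:
        raise ValueError("all association membership probabilities are zero")
    return den / num
```

A star whose association probability is concentrated in a single group with α_k = 4 receives a field prior of 1/4, while a star split evenly between groups with α_k = 4 and α_k = 2 receives 1/3.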

This choice of priors ensures that the recovery rates are similar (between 50% and 90% depending on the available observables) across all young associations when a probability threshold P = 90% is adopted, without biasing the relative young association membership probabilities. However, the false-positive rates at P = 90% will be different for each association, and are discussed in Section 8. The resulting α_k values are listed in Table 1.

8. The Performance of BANYAN Σ as a Bayesian Classifier

This section describes the classification performance of BANYAN Σ (in normal mode and in UVW-only mode), and compares it to those of the convergent point tool (Jones 1971; de Bruijne 1999; Mamajek 2005; Rodriguez et al. 2011), BANYAN I (Malo et al. 2013), BANYAN II (Gagné et al. 2014), LACEwING (Riedel et al. 2017b) and the ${ \mathcal M }$ -value metric introduced by Bowler et al. (2017). The probabilities or goodness-of-fit metrics of these different tools cannot be compared in absolute terms; however, their rates of contamination as a function of the rate of true member recovery are physically meaningful and can be directly compared.

The 7.15 × 10⁶ objects within 300 pc from the Besançon Galactic model and the bona fide members listed in Table 5 were used to determine the classification performance of each tool. In each case, the probabilities that each object belongs to a group H_k or to the field H_field were calculated using only sky position and proper motion, and the number of recovered true members ${N}_{k,\mathrm{TP}}$ from H_k was counted for a range of thresholds P_t, with:

$\begin{eqnarray}&&\displaystyle \frac{{P}_{k}}{{P}_{k}+{P}_{\mathrm{field}}}\geqslant {P}_{t},\end{eqnarray} \tag{ 22 }$

or P_k ≥ P_t for the tools that do not have a field hypothesis.

The number of false positives ${N}_{k,\mathrm{FP}}$ was defined as the number of Besançon objects that respect the same criterion. This particular way of normalizing probabilities ignores cross-contamination between young groups, and instead focuses on the contamination from field stars. These quantities make it possible to build receiver operating characteristic (ROC) curves, defined as the true positives rate TPR $=\,{N}_{k,\mathrm{TP}}/{N}_{k}$ as a function of the false positives rate $\mathrm{FPR}={N}_{k,\mathrm{FP}}/{N}_{\mathrm{field}}$ . The straight line defined by TPR = FPR corresponds to the performance of a random classification, and the ROC curve that lies farthest above this line corresponds to the best classification performance.
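A ROC curve of this kind can be assembled from two lists of scores, as in the minimal sketch below. The function names are hypothetical; `normalized_score` implements the criterion of Equation (22), and the scores of true members and of field objects are assumed to have been computed beforehand.

```python
def normalized_score(p_k, p_field):
    """Left-hand side of Equation (22): P_k / (P_k + P_field)."""
    return p_k / (p_k + p_field)


def roc_curve(member_scores, field_scores, thresholds):
    """Return a list of (FPR, TPR) pairs, one per threshold P_t.

    member_scores: scores of true members of association H_k.
    field_scores: scores of field objects tested against the same association.
    """
    points = []
    for p_t in thresholds:
        tpr = sum(1 for s in member_scores if s >= p_t) / len(member_scores)
        fpr = sum(1 for s in field_scores if s >= p_t) / len(field_scores)
        points.append((fpr, tpr))
    return points
```

A classifier that separates members from field objects perfectly passes through the point (FPR, TPR) = (0, 1), whereas a random classifier follows the diagonal TPR = FPR.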

The ROC curve of each classification tool was built by taking the sum of ${N}_{k,\mathrm{TP}}$ for only the young associations that are common to all tools, interpolated on a fixed array of ${N}_{k,\mathrm{FP}}$ . These young associations considered by all tools are βPMG, THA, COL, TWA, and ABDMG. The resulting ROC curves are displayed in Figure 9(a). The ROC curves do not compare the particular thresholds of each different tool, which are defined in different ways, but rather compare their astrophysically meaningful TPRs as a function of FPRs, which can be directly compared in an informative way. BANYAN Σ achieves a performance slightly better than BANYAN II and BANYAN I, especially at large true-positive rates (TPR > 0.6). It is likely that the lack of a field model in LACEwING, the convergent point tool, and the ${ \mathcal M }$ -value is the main reason they do not perform as well as the BANYAN tools under this metric. BANYAN Σ in UVW-only mode achieves a much lower performance under this metric, but performs better than the ${ \mathcal M }$ -value and the convergent point tool.

**Figure 9.** Panel (a): receiver operating characteristic curves of different membership classification tools for distinguishing members of βPMG, THA, COL, TWA, and ABDMG from field objects based on only sky position and proper motion, and ignoring cross-contamination between associations. CP designates the convergent point tool. BANYAN Σ achieves the best performance under this metric, especially in the region TPR > 0.6. Individual values of the probability or goodness-of-fit metric thresholds are indicated along ROC curves. Panel (b): receiver operating characteristic curves of different membership classification tools for distinguishing members of the same five associations, using only sky position and proper motion, and ignoring field contamination. Classification tools that use more complex kinematic models of young associations such as BANYAN Σ tend to perform better under this metric. See Section 8 for more details.

The resulting FPRs of all young associations are displayed in Figure 10(a) for each configuration of BANYAN Σ. In Figure 10(b), the FPRs in the case where only sky position and proper motion are used are displayed as a function of the characteristic angular size of the young associations, defined as their characteristic spatial size S_spa (see Section 5) divided by their distance. The FPRs are dominated by the characteristic angular size of young associations: the nearby and large associations that cover a significant area of the sky are much harder to distinguish from field stars, because there is a much larger set of field stars that can match their kinematics by pure chance. HYA and UMA suffer from much less contamination than would be expected given their characteristic angular size; this is because their average kinematics differ significantly from those of most field stars and of other young associations.

Another set of ROC curves was built in a similar way to measure the cross-contamination performance, by ignoring the field hypothesis and defining false-positives as members recovered in association H_k that originate from another association ${H}_{l},l\ne k$ . The resulting ROC curves are displayed in Figure 9(b). The classification tools that use more complex kinematic models tend to perform better under this metric: BANYAN Σ achieves the best performance, followed by BANYAN II, BANYAN I, the convergent point tool, BANYAN Σ in UVW-only mode, the ${ \mathcal M }$ -value, and LACEwING. The kinematic models of LACEwING are similar to those of BANYAN II, and its lower performance may instead be related to approximations that are done when transforming the Nσ metrics taken in sky position and proper motion space to probabilities directly. In particular, most of the LACEwING cross-contamination is due to confusion between ABDMG and βPMG, the members of which span the widest distributions of sky positions and proper motions.

In order to characterize the classification gains that are obtained by adding more observables in BANYAN Σ, field contamination ROC curves were built for each mode: (1) proper motion only, (2) proper motion and radial velocity, (3) proper motion and distance, or (4) proper motion, radial velocity, and distance. These ROC curves were built from all associations available in BANYAN Σ, and are displayed in Figure 11. This figure demonstrates that using measurements of radial velocity and distance cuts down field contamination by factors of ∼10 and ∼100 respectively at a fixed recovery rate. The fact that using only a distance measurement makes BANYAN Σ about a factor of ten better compared with only a radial velocity measurement is likely a consequence of distance being useful to constrain both the XYZ and UVW sets of coordinates, whereas radial velocity only helps to constrain UVW. This observation was already made for BANYAN II and LACEwING (Gagné et al. 2014; Riedel et al. 2017b).

**Figure 11.** Receiver operating characteristic curve for BANYAN Σ as a function of the observables used, for all groups included here. Sky coordinates were used in all cases, and cross-contamination between the groups was ignored. The addition of radial velocity and distance cut down the rate of field contamination by factors of ∼10 and ∼100 respectively at a fixed recovery rate. See Section 8 for more details.

The models of young associations were also used to draw 10⁵ synthetic association members in order to obtain smooth TPRs as a function of probability threshold for each group in BANYAN Σ, which are displayed in Figure 12. This figure illustrates the effect of the α_k thresholds described in Section 7, causing the TPR curves of all young associations to meet at P_t = 90% and TPR = 50%. Similar curves were built for FPRs and Matthews correlation coefficients (MCCs; Matthews & Czerwinski 1975). MCCs are defined in the range $[-1,1]$ and indicate the quality of a classifier: an MCC of −1 indicates a perfect misclassification, a value of 0 indicates a performance similar to a random classification, and a value of +1 indicates a perfect classifier. In these particular simulations, the number of true positives and false negatives (i.e., all synthetic objects originating from a young association model H_k) were scaled in the range $[0,{N}_{k}]$ instead of $[0,{10}^{5}]$, to give a realistic representation of the FPR and MCC, where N_k represents the number of stars in each young association. Because young associations with N_k < 20 are expected to be incomplete, we have set a minimal value of N_k = 20 in these situations. In a similar way, the number of true negatives and false positives (i.e., all synthetic objects originating from the Besançon Galactic model) were scaled in the range [0, 217,680], corresponding to the number of OBAFG-type stars in the model (out of 7.15 × 10⁶ stars). In doing this, we assume that most young associations are complete at this fraction; this approximation only affects the FPR and MCC values reported in this work. The values N_k are listed for each young association in Table 1.
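For reference, the MCC follows from the four entries of the confusion matrix as MCC = (TP·TN − FP·FN) / √[(TP+FP)(TP+FN)(TN+FP)(TN+FN)]. The Python sketch below evaluates it after rescaling the synthetic counts in the way described above. The function names, the argument layout, and the convention of returning 0 for an empty row or column are assumptions made for illustration; the sketch does not attempt to reproduce the exact values of Table 10.

```python
import math

# Number of OBAFG-type stars in the Besancon model, as quoted in the text.
N_FIELD = 217_680
# Minimal association size adopted in the text for incomplete groups.
N_MIN = 20


def scaled_counts(tp_frac, fp_frac, n_k, n_field=N_FIELD, n_min=N_MIN):
    """Rescale recovered fractions to realistic population sizes.

    tp_frac: fraction of synthetic members recovered above the threshold.
    fp_frac: fraction of synthetic field objects above the threshold.
    n_k:     number of stars in the association (floored at n_min).
    Returns the tuple (TP, FP, TN, FN).
    """
    n_k = max(n_k, n_min)
    tp = tp_frac * n_k
    fn = (1.0 - tp_frac) * n_k
    fp = fp_frac * n_field
    tn = (1.0 - fp_frac) * n_field
    return tp, fp, tn, fn


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient, in the range [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0.0:
        return 0.0  # assumed convention when a row or column is empty
    return (tp * tn - fp * fn) / denom


# Hypothetical example: a 12-member group (treated as 20 members) in which
# half of the members and one field object in 10^4 pass the threshold.
example_mcc = mcc(*scaled_counts(tp_frac=0.5, fp_frac=1e-4, n_k=12))
```

Because the field population is several orders of magnitude larger than any association, even a small fraction of accepted field objects can outnumber the true positives, which is why the MCC is sensitive to the adopted scaling.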

**Figure 12.** True positive rate (TPR) as a function of Bayesian probability threshold P_t for different young associations, when only sky position and proper motion are considered. Bayesian priors were chosen for P_t = 90% to yield TPR = 50% in this scenario (e.g., see the thick gray lines). The dashed thick gray lines represent the range in P_t for which the range of recovery rates are reported in Table 10. See Section 8 for more details.

The resulting TPR, FPR, and MCC values for each young association, using each possible set of observables, are reported in Table 10, as the range of possible values within P_t ∈ 85%–95% and centered on P_t = 90%. They demonstrate how members of distant associations are much easier to distinguish from field objects because of their narrower distribution on the sky, with smaller FPRs and larger MCCs. Both the IDL and Python implementations of BANYAN Σ specify as an output for each star the TPR and FPR of a sample that would be constructed with only candidate members that have membership probabilities equal to or higher than that of the star in question. This will allow users to interpret the Bayesian probabilities more easily without needing to rely on Table 10.

We investigated the effect of ignoring covariances between the input observables (see Section 3.3) with a 10⁴-element Monte Carlo simulation. The median proper motion and parallax errors of all bona fide members in Gaia-DR1 (respectively 0.2 and 0.15 mas yr⁻¹, and 0.3 mas) were assigned as the measurement errors to the observables of a typical member (TWA 1). The radial velocity of TWA 1 was ignored to maximize the effect of the covariances in the other measurements, and therefore to be as conservative as possible. Random measurements of its proper motion and parallax were taken from a 3D multivariate Gaussian distribution, where each dimension corresponds to the two components of proper motion and the parallax, and assuming a 99.9% correlation between all dimensions. Each of these synthetic stars was assigned a vanishingly small error on its proper motion and distance (which we set to the median Gaia-DR1 values divided by one hundred). The membership probabilities of the 10⁴ synthetic stars were then calculated with BANYAN Σ, and the average probability was calculated, which corresponds to the final probability marginalized over all values of proper motion and parallax. Ignoring the covariances in the observed measurements resulted in a negligible difference of ∼0.09% in the membership probability, and similarly negligible differences of respectively 0.2% and 0.01% in the measured optimal distance and radial velocity. A similar Monte Carlo analysis was performed where the distance measurement was ignored, and yielded even smaller differences: 3 × 10⁻⁵% in probability, 3 × 10⁻⁶% in optimal distance and 8 × 10⁻⁶% in optimal radial velocity. The effects of ignoring covariances between the components of sky position and other quantities would be even smaller, because the error bars on the sky position are always significantly smaller than those on proper motion and parallax.
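The sampling step of this Monte Carlo test can be sketched as follows. The sketch draws correlated proper motion and parallax offsets from a 3D Gaussian using a Cholesky factor of the covariance matrix, then averages a user-supplied probability function over the draws. The callable `membership_probability` stands in for a BANYAN Σ evaluation and is purely hypothetical, as are the other function names; the draws are expressed as offsets from the measured values, so no measurement of TWA 1 is assumed here.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def draw_correlated(sigmas, rho, n_draws, seed=0):
    """Draw zero-mean Gaussian offsets with standard deviations `sigmas`
    and the same correlation coefficient `rho` between every pair."""
    n = len(sigmas)
    cov = [
        [sigmas[i] * sigmas[j] * (1.0 if i == j else rho) for j in range(n)]
        for i in range(n)
    ]
    low = cholesky(cov)
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        draws.append(
            [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        )
    return draws


def correlation(draws, i, j):
    """Sample correlation coefficient between dimensions i and j."""
    n = len(draws)
    mi = sum(d[i] for d in draws) / n
    mj = sum(d[j] for d in draws) / n
    cij = sum((d[i] - mi) * (d[j] - mj) for d in draws)
    cii = sum((d[i] - mi) ** 2 for d in draws)
    cjj = sum((d[j] - mj) ** 2 for d in draws)
    return cij / math.sqrt(cii * cjj)


def marginalized_probability(draws, membership_probability):
    """Average a (hypothetical) probability function over all draws."""
    return sum(membership_probability(d) for d in draws) / len(draws)


# Median Gaia-DR1 errors quoted in the text: two proper motion components
# (mas/yr) and parallax (mas), with a 99.9% correlation between all pairs.
SIGMAS = (0.2, 0.15, 0.3)
draws = draw_correlated(SIGMAS, rho=0.999, n_draws=10_000)
```

Comparing the averaged probability with the one obtained from uncorrelated errors of the same size would then give the kind of difference quoted above.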

Table 10. BANYAN Σ Classification Performance for Different Young Associations as a Function of Input Observables

Asso.	TPR_crit				log₁₀ FPR_crit				log₁₀ MCC_crit
	μ	μ, ν	μ, ϖ	μ, ν, ϖ	μ	μ, ν	μ, ϖ	μ, ν, ϖ	μ	μ, ν	μ, ϖ	μ, ν, ϖ
118TAU	${0.5}_{-0.1}^{+0.3}$	${0.7}_{-0.1}^{+0.2}$	${0.82}_{-0.05}^{+0.10}$	${0.90}_{-0.03}^{+0.07}$	$-{4.9}_{-0.2}^{+0.3}$	−5.6 ± 0.2	$-{5.9}_{-0.2}^{+0.3}$	$-{6.4}^{+0.5}$	$-{1.39}_{-0.01}^{+0.20}$	$-{0.88}_{-0.02}^{+0.08}$	$-{0.70}_{-0.06}^{+0.07}$	$-{0.43}_{-0.01}^{+0.10}$
ABDMG	${0.5}_{-0.1}^{+0.2}$	${0.68}_{-0.07}^{+0.10}$	${0.82}_{-0.06}^{+0.20}$	${0.90}_{-0.03}^{+0.09}$	$-{2.8}_{-0.2}^{+0.4}$	$-{3.5}_{-0.2}^{+0.3}$	$-{4.0}_{-0.2}^{+0.3}$	$-{4.5}_{-0.1}^{+0.3}$	$-{2.24}_{-0.01}$	$-{1.78}_{-0.05}^{+0.07}$	$-{1.43}_{-0.05}^{+0.07}$	$-{1.14}_{-0.06}^{+0.08}$
BPMG	${0.5}_{-0.1}^{+0.2}$	${0.68}_{-0.07}^{+0.10}$	${0.82}_{-0.06}^{+0.10}$	${0.90}_{-0.03}^{+0.08}$	$-{2.6}_{-0.2}^{+0.4}$	$-{3.2}_{-0.2}^{+0.3}$	$-{4.4}_{-0.1}^{+0.3}$	$-{4.8}_{-0.2}^{+0.3}$	$-{2.39}_{-0.02}$	$-{1.95}_{-0.04}^{+0.06}$	$-{1.28}_{-0.05}^{+0.06}$	−1.03 ± 0.09
CAR	${0.5}_{-0.1}^{+0.2}$	${0.68}_{-0.09}^{+0.20}$	${0.82}_{-0.05}^{+0.10}$	${0.90}_{-0.03}^{+0.07}$	$-{2.9}_{-0.2}^{+0.5}$	$-{3.6}_{-0.2}^{+0.4}$	$-{5.0}_{-0.2}^{+0.4}$	$-{5.3}_{-0.1}^{+0.5}$	$-{2.36}^{+0.06}$	$-{1.93}_{-0.04}^{+0.03}$	$-{1.12}_{-0.08}^{+0.10}$	$-{0.96}_{-0.04}^{+0.20}$
CARN	${0.5}_{-0.1}^{+0.2}$	${0.68}_{-0.08}^{+0.20}$	${0.82}_{-0.06}^{+0.10}$	${0.90}_{-0.03}^{+0.09}$	$-{3.3}_{-0.2}^{+0.5}$	$-{4.0}_{-0.2}^{+0.3}$	$-{4.5}_{-0.2}^{+0.3}$	$-{4.9}_{-0.1}^{+0.3}$	$-{2.18}^{+0.03}$	$-{1.71}_{-0.05}^{+0.03}$	$-{1.39}_{-0.06}^{+0.05}$	$-{1.14}_{-0.06}^{+0.10}$
CBER	${0.5}_{-0.2}^{+0.5}$	${0.7}_{-0.1}^{+0.4}$	${0.82}_{-0.06}^{+0.20}$	${0.90}_{-0.04}^{+0.10}$	$-{4.5}_{-0.2}^{+0.6}$	$-{5.2}_{-0.2}^{+0.6}$	$-{6.9}_{-0.3}$	$\lt -7$	$-{1.40}_{-0.02}^{+0.80}$	$-{0.96}_{-0.03}^{+0.05}$	$-{0.19}_{-0.06}^{+0.08}$	$-{0.17}_{-0.01}^{+0.02}$
COL	${0.50}_{-0.08}^{+0.10}$	${0.68}_{-0.05}^{+0.09}$	${0.82}_{-0.05}^{+0.10}$	${0.90}_{-0.03}^{+0.06}$	$-{2.7}_{-0.1}^{+0.3}$	$-{3.2}_{-0.1}^{+0.2}$	$-{4.5}_{-0.1}^{+0.3}$	$-{4.8}_{-0.1}^{+0.2}$	$-{2.46}_{-0.01}$	$-{2.08}_{-0.03}^{+0.05}$	$-{1.33}_{-0.05}^{+0.07}$	$-{1.13}_{-0.05}^{+0.08}$
CRA	${0.5}_{-0.1}^{+0.2}$	${0.68}_{-0.09}^{+0.20}$	${0.82}_{-0.06}^{+0.20}$	${0.90}_{-0.03}^{+0.09}$	$-{5.06}_{-0.09}^{+0.30}$	−5.8 ± 0.1	$-{6.55}_{-0.20}$	$\lt -7$	$-{1.31}_{-0.06}^{+0.10}$	$-{0.81}^{+0.08}$	$-{0.39}_{-0.05}^{+0.04}$	$\gt -0.22$
EPSC	${0.5}_{-0.2}^{+0.3}$	${0.7}_{-0.1}^{+0.3}$	${0.82}_{-0.06}^{+0.20}$	${0.90}_{-0.03}^{+0.10}$	$-{4.9}_{-0.2}^{+0.6}$	$-{5.3}_{-0.2}^{+0.5}$	$\lt -7$	$-{6.85}_{-0.50}$	$-{1.35}_{-0.01}^{+0.20}$	−1.00 ± 0.05	$-{0.43}_{-0.03}^{+0.10}$	$-{0.22}_{-0.20}^{+0.03}$
ETAC	${0.5}_{-0.2}^{+0.3}$	${0.7}_{-0.1}^{+0.3}$	${0.82}_{-0.07}^{+0.20}$	${0.90}_{-0.04}^{+0.10}$	$-{6.0}_{-0.2}^{+0.4}$	$-{6.37}_{-0.40}^{+0.01}$	$\lt -7$	$\lt -7$	$-{0.85}_{-0.03}^{+0.30}$	$-{0.6}_{-0.1}^{+0.2}$	$\gt -0.22$	$\gt -0.22$
HYA	${0.5}_{-0.1}^{+0.3}$	${0.68}_{-0.09}^{+0.20}$	${0.82}_{-0.06}^{+0.10}$	${0.90}_{-0.03}^{+0.09}$	$-{5.4}_{-0.2}^{+0.6}$	$-{6.2}_{-0.3}^{+0.7}$	$-{6.6}_{-0.2}^{+0.3}$	$-{6.82}_{-0.40}$	$-{0.67}_{-0.02}^{+0.05}$	$-{0.27}_{-0.06}^{+0.04}$	$-{0.12}_{-0.01}^{+0.02}$	$-{0.07}_{-0.05}^{+0.02}$
IC2391	${0.5}_{-0.1}^{+0.3}$	${0.68}_{-0.07}^{+0.20}$	${0.82}_{-0.04}^{+0.09}$	${0.90}_{-0.02}^{+0.05}$	$-{5.3}_{-0.1}^{+0.3}$	$-{5.7}_{-0.1}^{+0.3}$	$-{6.0}^{+0.4}$	$-{6.6}^{+0.3}$	$-{1.18}_{-0.03}^{+0.20}$	$-{0.88}^{+0.03}$	$-{0.66}_{-0.02}^{+0.10}$	$-{0.36}^{+0.09}$
IC2602	${0.5}_{-0.1}^{+0.3}$	${0.68}_{-0.08}^{+0.20}$	${0.82}_{-0.05}^{+0.10}$	${0.90}_{-0.02}^{+0.06}$	$-{5.0}_{-0.1}^{+0.3}$	$-{5.42}_{-0.07}^{+0.20}$	$\lt -7$	$\lt -7$	$-{1.35}_{-0.05}^{+0.30}$	$-{1.00}_{-0.01}^{+0.08}$	−0.28 ± 0.02	$\gt -0.24$
LCC	${0.5}_{-0.1}^{+0.3}$	${0.68}_{-0.09}^{+0.20}$	${0.82}_{-0.06}^{+0.10}$	${0.90}_{-0.03}^{+0.08}$	$-{3.2}_{-0.2}^{+0.4}$	$-{3.5}_{-0.2}^{+0.3}$	$-{4.5}_{-0.2}^{+0.3}$	−4.6 ± 0.2	$-{1.91}_{-0.02}^{+0.10}$	$-{1.62}_{-0.03}^{+0.02}$	$-{1.06}_{-0.06}^{+0.07}$	−0.97 ± 0.07
OCT	${0.50}_{-0.06}^{+0.10}$	${0.68}_{-0.03}^{+0.05}$	${0.82}_{-0.02}^{+0.04}$	${0.90}_{-0.01}^{+0.02}$	$-{3.0}_{-0.1}^{+0.2}$	$-{3.22}_{-0.09}^{+0.20}$	$-{3.43}_{-0.09}^{+0.20}$	$-{3.38}_{-0.05}^{+0.10}$	$-{2.31}^{+0.01}$	$-{2.09}_{-0.03}^{+0.04}$	$-{1.91}_{-0.03}^{+0.06}$	$-{1.89}_{-0.02}^{+0.04}$
PL8	${0.5}_{-0.1}^{+0.3}$	${0.68}_{-0.08}^{+0.20}$	${0.82}_{-0.04}^{+0.09}$	${0.90}_{-0.02}^{+0.05}$	$-{4.3}_{-0.1}^{+0.2}$	$-{4.6}_{-0.1}^{+0.2}$	$-{5.4}_{-0.1}^{+0.2}$	$-{5.7}_{-0.1}^{+0.2}$	$-{1.70}_{-0.03}^{+0.30}$	$-{1.38}^{+0.03}$	−0.94 ± 0.05	$-{0.72}_{-0.05}^{+0.08}$
PLE	${0.5}_{-0.2}^{+0.4}$	${0.7}_{-0.1}^{+0.3}$	${0.82}_{-0.05}^{+0.10}$	${0.90}_{-0.03}^{+0.09}$	$-{5.3}_{-0.3}^{+0.4}$	$-{5.7}_{-0.2}^{+0.7}$	$-{6.0}_{-0.4}^{+0.2}$	$-{6.4}_{-0.3}^{+0.5}$	$-{0.72}_{-0.01}^{+0.40}$	$-{0.43}_{-0.01}^{+0.03}$	$-{0.2}_{-0.1}$	$-{0.12}_{-0.06}^{+0.03}$
ROPH	${0.5}_{-0.1}^{+0.2}$	${0.68}_{-0.07}^{+0.20}$	${0.82}_{-0.07}^{+0.20}$	${0.90}_{-0.04}^{+0.10}$	−5.4 ± 0.2	−6.3 ± 0.1	$\lt -7$	$\lt -7$	$-{0.68}^{+0.08}$	$-{0.24}^{+0.05}$	$\gt -0.05$	$\gt -0.04$
TAU	${0.50}_{-0.08}^{+0.10}$	${0.68}_{-0.07}^{+0.10}$	${0.82}_{-0.06}^{+0.20}$	${0.90}_{-0.03}^{+0.09}$	$-{2.5}_{-0.1}^{+0.2}$	$-{3.0}_{-0.1}^{+0.2}$	$-{4.2}_{-0.2}^{+0.3}$	$-{4.5}_{-0.2}^{+0.3}$	$-{2.18}_{-0.01}^{+0.05}$	$\gt -1.81$	$-{1.14}_{-0.06}^{+0.07}$	$-{0.93}_{-0.06}^{+0.10}$
THA	${0.5}_{-0.1}^{+0.2}$	${0.68}_{-0.08}^{+0.20}$	${0.82}_{-0.05}^{+0.10}$	${0.90}_{-0.03}^{+0.07}$	$-{3.9}_{-0.2}^{+0.5}$	$-{4.5}_{-0.2}^{+0.4}$	$-{5.3}_{-0.1}^{+0.3}$	$-{5.47}_{-0.08}^{+0.20}$	$-{1.76}_{-0.03}$	$-{1.33}_{-0.06}^{+0.08}$	$-{0.84}_{-0.04}^{+0.07}$	$-{0.71}_{-0.03}^{+0.05}$
THOR	${0.5}_{-0.1}^{+0.3}$	${0.7}_{-0.1}^{+0.2}$	${0.82}_{-0.06}^{+0.20}$	${0.90}_{-0.03}^{+0.10}$	$-{3.6}_{-0.2}^{+0.5}$	$-{4.2}_{-0.2}^{+0.4}$	$-{5.7}_{-0.3}^{+0.5}$	$-{6.1}_{-0.2}^{+0.5}$	$-{1.90}_{-0.01}^{+0.10}$	$-{1.49}_{-0.03}^{+0.01}$	$-{0.66}_{-0.09}^{+0.10}$	$-{0.46}_{-0.08}^{+0.10}$
TWA	${0.5}_{-0.1}^{+0.2}$	${0.68}_{-0.09}^{+0.20}$	${0.82}_{-0.05}^{+0.10}$	${0.90}_{-0.03}^{+0.08}$	$-{4.1}_{-0.2}^{+0.5}$	$-{4.6}_{-0.2}^{+0.4}$	$-{5.5}_{-0.1}^{+0.3}$	$-{5.8}_{-0.2}^{+0.3}$	$-{1.76}^{+0.03}$	−1.37 ± 0.04	$-{0.86}_{-0.04}^{+0.07}$	$-{0.67}_{-0.08}^{+0.10}$
UCL	${0.5}_{-0.1}^{+0.3}$	${0.68}_{-0.09}^{+0.20}$	${0.82}_{-0.06}^{+0.10}$	${0.90}_{-0.03}^{+0.08}$	$-{2.9}_{-0.2}^{+0.4}$	$-{3.2}_{-0.2}^{+0.3}$	$-{3.7}_{-0.2}^{+0.3}$	$-{3.9}_{-0.1}^{+0.2}$	$-{2.05}_{-0.01}^{+0.10}$	$-{1.75}_{-0.02}$	$-{1.40}_{-0.05}^{+0.09}$	$-{1.27}_{-0.06}^{+0.08}$
UCRA	${0.5}_{-0.1}^{+0.2}$	${0.68}_{-0.06}^{+0.10}$	${0.82}_{-0.03}^{+0.07}$	${0.90}_{-0.02}^{+0.03}$	$-{4.3}_{-0.1}^{+0.3}$	$-{4.8}_{-0.1}^{+0.2}$	$-{5.2}_{-0.1}^{+0.2}$	$-{5.32}_{-0.07}^{+0.20}$	$-{1.67}_{-0.02}^{+0.10}$	$-{1.29}_{-0.02}$	$-{1.02}_{-0.05}^{+0.07}$	$-{0.92}_{-0.03}^{+0.09}$
UMA	${0.5}_{-0.1}^{+0.2}$	${0.68}_{-0.09}^{+0.20}$	${0.82}_{-0.06}^{+0.10}$	${0.90}_{-0.03}^{+0.09}$	$\lt -7$	$\lt -7$	$\lt -7$	$\lt -7$	$-{0.58}_{-0.07}^{+0.09}$	−0.33 ± 0.02	$\gt -0.21$	∼0
USCO	${0.5}_{-0.1}^{+0.2}$	${0.68}_{-0.09}^{+0.20}$	${0.82}_{-0.05}^{+0.10}$	${0.90}_{-0.03}^{+0.08}$	$-{3.7}_{-0.2}^{+0.3}$	$-{4.0}_{-0.1}^{+0.3}$	$-{4.5}_{-0.1}^{+0.3}$	$-{4.7}_{-0.1}^{+0.3}$	$-{1.70}_{-0.02}^{+0.10}$	$-{1.39}_{-0.01}^{+0.02}$	$-{1.04}_{-0.04}^{+0.08}$	$-{0.93}_{-0.04}^{+0.09}$
XFOR	${0.5}_{-0.1}^{+0.3}$	${0.68}_{-0.08}^{+0.20}$	${0.82}_{-0.04}^{+0.10}$	${0.90}_{-0.02}^{+0.05}$	$-{5.4}_{-0.1}^{+0.4}$	$-{5.9}_{-0.3}^{+0.2}$	$-{6.82}_{-0.30}$	$\lt -7$	$-{1.15}_{-0.03}^{+0.20}$	$-{0.78}_{-0.08}^{+0.06}$	$-{0.29}_{-0.08}^{+0.02}$	$-{0.25}^{+0.02}$

Note. True-positive rates (TPRs), false-positive rates (FPRs) and Matthews correlation coefficients (MCCs) are reported at the "critical" P_t = 90% threshold, and their reported ranges are for thresholds in the range 85%–95%. See Section 8 for more details.


Table 11. BANYAN Σ Probabilities for New Candidate Members of Sco-Cen Identified by Rizzuto et al. (2011)

2MASS	Asso.	P
Designation		(%)
Unambiguous candidate members of USCO
15480330-2512562	USCO	96.8
15574880-2331383	USCO	99.0
16052655-1948066	USCO	99.2
16120593-2314445	USCO	98.6
16203056-2006518	USCO	99.2
16301246-2506548	USCO	97.1
Unambiguous candidate members of UCL
13514960-3259387	UCL	96.3
14170338-3432122	UCL	96.6
14250102-3726493	UCL	97.8
14281043-2929299	UCL	97.3
14353043-4209281	UCL	99.6
14353149-4131026	UCL	99.9
15025928-3238357	UCL	96.9
15130106-3714480	UCL	99.6
15183199-4752307	UCL	97.6
15195653-3006249	UCL	95.4
15342085-3920572	UCL	99.6
15383263-3909384	UCL	99.8
15500707-5312351	UCL	96.0
16003131-3605164	UCL	98.6
16040712-3510367	UCL	96.8
16312294-3442153	UCL	95.6
16383094-3909083	UCL	95.5
Unambiguous candidate members of LCC
11454479-5241258	LCC	95.1
12044525-5915117	LCC	99.8
12113912-5222065	LCC	98.4
12263615-6305571	LCC	99.8
12474326-5941194	LCC	99.7
Ambiguous candidate members
13172895-4255587	TWA,UCL,LCC	79.4,10.5,5.7
13324248-5549391	LCC,UCL	89.4,7.8
13414477-5433339	LCC,UCL	76.6,21.6
14074081-4842144	UCL,LCC	80.4,18.8
15585013-3203082	UCL,USCO	74.3,23.4
16115069-2733098	USCO,UCL	49.3,47.4
16201552-2843008	USCO,UCL	77.7,20.6

Note. See Section 9 for more details.


9. Classifying Previously Ambiguous Members with BANYAN Σ

In this section, BANYAN Σ is used to assign classifications to previously ambiguous members encountered in the construction of the bona fide members lists (e.g., see the discussion in Section 4). None of the stars analyzed in this section was used to construct the young association models of BANYAN Σ. Only objects with a P ≥ 95% Bayesian probability of belonging to young associations are discussed here.

TWA 19 AB is a young star that was defined as a candidate member of TWA by Mamajek (2005), but later listed as a likely contaminant from LCC by Gagné et al. (2017b) based on a preliminary version of BANYAN Σ. Here we obtain a membership probability of P = 99% that it is a member of LCC.

Zuckerman & Song (2004) listed several tentative members of ABDMG as having a "questionable membership." We find that five of them (HD 6569, HD 13482, HD 139751, HD 218860, and HD 224228) obtain a P > 97% membership probability associated with ABDMG.

2MASS J05361998–1920396 is a young L2 γ substellar object that was listed as an ambiguous member of COL and βPMG by Faherty et al. (2016). Here, we obtain an unambiguous P = 99.9% probability that it is a member of COL.

CP–68 1388, GSC 09235–01702, CD–69 1055, and MP Mus were all identified as ambiguous members of LCC or EPSC based on literature compilations. All but GSC 09235–01702 obtain an unambiguous P > 98% LCC membership probability when analyzed with BANYAN Σ. GSC 09235–01702 remains somewhat ambiguous with an 85% EPSC membership probability and a 12.7% LCC membership probability.

As mentioned in Section 4, there is a subset of new candidate members of the Sco-Cen region (consisting of the USCO, LCC and UCL sub-groups) discovered by Rizzuto et al. (2011) that were not assigned to any of its sub-groups. Of these candidate members, 35 benefit from at least a radial velocity and/or parallax measurement, and obtain a P > 95% young association membership probability. These objects are listed in Table 11 along with their respective probabilities in UCL, LCC and USCO.

Only seven of them remain ambiguous between more than one of the Sco-Cen subgroups, and they all obtain negligible membership probabilities for associations outside of the Sco-Cen region, with one exception: 2MASS J13172895–4255587 (F3 V; Houk 1978) obtains respective membership probabilities of 79.4% for TWA, 10.5% for UCL, 5.7% for LCC, and 4.3% for the field. This star has not been identified by Gagné et al. (2017b), who used a preliminary BANYAN Σ model of TWA to identify new members based on Hipparcos. Further studies of this star may be able to confirm whether its age and radial velocity match those of TWA better than the Sco-Cen region.

10. Summary and Conclusions

A new Bayesian algorithm to identify young association members was presented. It derives membership probabilities from the sky position and proper motion of an object, and optionally radial velocity, parallax, and spectrophotometric distance constraints. It includes various improvements over its predecessor BANYAN II (Gagné et al. 2014), as it includes more associations, an updated list of bona fide members including Gaia-DR1 data, better spatial-kinematic models, a more accurate model of field stars, fewer approximations, new options, and a significantly enhanced execution speed due to an analytical solution of the marginalization integrals. One limitation of BANYAN Σ is that it cannot account for correlations in measured error bars, such as those reported in Gaia, but this results in biases of less than 0.1% in membership probability and measurements of the optimal distance and radial velocity.

The new BANYAN Σ tool includes all of the 27 currently known young associations within 150 pc, for which the current census of bona fide members is updated. An online tool is also made publicly available at http://www.exoplanetes.umontreal.ca/banyan/banyansigma.php. Additional figures and information on this work can be found on the website http://www.astro.umontreal.ca/~gagne.

We strongly encourage users to investigate the position of candidate members in appropriate color–magnitude diagrams using the BANYAN Σ optimal distance when no parallax measurements are used as inputs, as candidate lists generated without them suffer from ∼100 times more false positives in comparison (see Section 8 and Figure 11). The BANYAN Σ algorithm can include distance constraints from population sequences in color–magnitude diagrams, and future versions will provide examples of such sequences.

This first version of BANYAN Σ will be the basis of a search for new isolated planetary-mass members in young associations based on 2MASS and AllWISE through the BASS-Ultracool survey (J. Gagné et al. 2018, in preparation; see also Gagné et al. 2015a, 2017a for preliminary results from BASS-Ultracool), as well as a search for new stellar members based on Gaia-DR1 (J. Gagné et al. 2018, in preparation). The release of Gaia-DR2 will allow us to significantly improve the spatial and kinematic models of BANYAN Σ, and identify new members of young associations that are not part of the Tycho catalog. A second version of the BANYAN Σ software with such improved models will be released in a future work that will aim at furthering the census of young association members based on Gaia-DR2.

The authors would like to thank the anonymous referee and the AAS statistics consultant for valuable and detailed comments that significantly improved the quality of this paper. We thank Bruno S. Alessi and Eric Bubar for sharing data. We thank Noé Aubin-Cadot, Joel Kastner, Thierry Bazier-Matte, Simon Gélinas, Jean-François Désilets and Brendan Bowler for useful comments, as well as the anonymous referee of Gagné et al. (2015b), who suggested the inclusion of a parallax motion correction in the BANYAN tools.

This research made use of: the SIMBAD database and VizieR catalog access tool, operated at the Centre de Données astronomiques de Strasbourg, France (Ochsenbein et al. 2000); data products from the Two Micron All Sky Survey (2MASS; Skrutskie et al. 2006), which is a joint project of the University of Massachusetts and the Infrared Processing and Analysis Center (IPAC)/California Institute of Technology (Caltech), funded by the National Aeronautics and Space Administration (NASA) and the National Science Foundation; data products from the Wide-field Infrared Survey Explorer (WISE; Wright et al. 2010), which is a joint project of the University of California, Los Angeles, and the Jet Propulsion Laboratory (JPL)/Caltech, funded by NASA. This project was developed in part at the 2017 Heidelberg Gaia Sprint, hosted by the Max-Planck-Institut für Astronomie, Heidelberg. This work has made use of data from the European Space Agency (ESA) mission Gaia (http://www.cosmos.esa.int/gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, http://www.cosmos.esa.int/web/gaia/dpac/consortium). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement. Part of this research was carried out at the Jet Propulsion Laboratory, Caltech, under a contract with NASA. BGM simulations were executed on computers from the Utinam Institute of the Université de Franche-Comté supported by the Région de Franche-Comté and Institut des Sciences de l'Univers (INSU).

J.G. designed BANYAN Σ, compiled the bona fide members, wrote the IDL codes, wrote the manuscript, generated figures and led all analyses; E.E.M. shared parts of the young association literature data, characteristics and bona fide members, and provided general comments; O.R.L. wrote the initial Python translation of BANYAN Σ; L.M., A.R., and D.R. performed the BANYAN I, LACEwING and convergent point tool calculations used in Section 8, respectively; A.R. also provided help with the bona fide members compilation; D.L. and J.K.F. shared ideas and comments; L.P. shared ideas and provided comments especially for the ROC curves analysis; A.R. performed custom Besançon Galactic simulations and wrote the second paragraph of Section 6; and R.D. shared comments and supervised O.R.L.

Software: BANYAN Σ (this paper), LACEwING (Riedel et al. 2017b), BANYAN II (Gagné et al. 2014), BANYAN I (Malo et al. 2013), the convergent point tool (Rodriguez et al. 2011), Notability by Ginger Labs, Sublime Text.

Appendix A: Coordinate Transformation of the Bayesian Likelihood

In this section, a change of coordinates is applied to the Bayesian likelihood ${{ \mathcal P }}_{o}(\{{O}_{i}\}| {H}_{k})$ from the direct observables frame of reference {O_i} (sky position, proper motion, etc.) to the Galactic position (XYZ) and space velocity (UVW) reference frame {Q_i}:

$\begin{eqnarray*}&&{{ \mathcal P }}_{o}(\{{O}_{i}\}| {H}_{k})\displaystyle \prod _{j}{{dO}}_{j}={{ \mathcal P }}_{q}(\{{Q}_{i}\}| {H}_{k})\displaystyle \prod _{j}{{dQ}}_{j}.\end{eqnarray*}$

This change of coordinates can be expressed in the following form:

$\begin{eqnarray}&&{{ \mathcal P }}_{o}(\{{O}_{i}\}| {H}_{k})=| \bar{\bar{J}}| \,\cdot \,{{ \mathcal P }}_{q}(\{{Q}_{i}\}| {H}_{k}),\end{eqnarray} \tag{ 23 }$

$\begin{eqnarray}&&{\rm{with}}\,{J}_{{lm}}=\displaystyle \frac{\partial {Q}_{l}}{\partial {O}_{m}},\end{eqnarray} \tag{ 24 }$

where $\bar{\bar{J}}$ is a Jacobian matrix, which can be expressed as a 2 × 2 block matrix of four 3 × 3 sub-matrices:

$\begin{eqnarray*}\bar{\bar{J}} & = & \left[\begin{array}{cc}{\boldsymbol{ \mathcal A }} & {\boldsymbol{ \mathcal B }}\\ {\boldsymbol{ \mathcal C }} & {\boldsymbol{ \mathcal D }}\end{array}\right],\\ {\boldsymbol{ \mathcal A }} & = & \left[\begin{array}{ccc}\tfrac{\partial X}{\partial \alpha } & \tfrac{\partial X}{\partial \delta } & \tfrac{\partial X}{\partial \varpi }\\ \tfrac{\partial Y}{\partial \alpha } & \tfrac{\partial Y}{\partial \delta } & \tfrac{\partial Y}{\partial \varpi }\\ \tfrac{\partial Z}{\partial \alpha } & \tfrac{\partial Z}{\partial \delta } & \tfrac{\partial Z}{\partial \varpi }\end{array}\right],\\ {\boldsymbol{ \mathcal B }} & = & \left[\begin{array}{ccc}\tfrac{\partial X}{\partial {\mu }_{\alpha }} & \tfrac{\partial X}{\partial {\mu }_{\delta }} & \tfrac{\partial X}{\partial \nu }\\ \tfrac{\partial Y}{\partial {\mu }_{\alpha }} & \tfrac{\partial Y}{\partial {\mu }_{\delta }} & \tfrac{\partial Y}{\partial \nu }\\ \tfrac{\partial Z}{\partial {\mu }_{\alpha }} & \tfrac{\partial Z}{\partial {\mu }_{\delta }} & \tfrac{\partial Z}{\partial \nu }\end{array}\right],\\ {\boldsymbol{ \mathcal C }} & = & \left[\begin{array}{ccc}\tfrac{\partial U}{\partial \alpha } & \tfrac{\partial U}{\partial \delta } & \tfrac{\partial U}{\partial \varpi }\\ \tfrac{\partial V}{\partial \alpha } & \tfrac{\partial V}{\partial \delta } & \tfrac{\partial V}{\partial \varpi }\\ \tfrac{\partial W}{\partial \alpha } & \tfrac{\partial W}{\partial \delta } & \tfrac{\partial W}{\partial \varpi }\end{array}\right],\\ {\boldsymbol{ \mathcal D }} & = & \left[\begin{array}{ccc}\tfrac{\partial U}{\partial {\mu }_{\alpha }} & \tfrac{\partial U}{\partial {\mu }_{\delta }} & \tfrac{\partial U}{\partial \nu }\\ \tfrac{\partial V}{\partial {\mu }_{\alpha }} & \tfrac{\partial V}{\partial {\mu }_{\delta }} & \tfrac{\partial V}{\partial \nu }\\ \tfrac{\partial W}{\partial {\mu }_{\alpha }} & \tfrac{\partial W}{\partial {\mu }_{\delta }} & \tfrac{\partial W}{\partial \nu }\end{array}\right].\end{eqnarray*}$

From the definition of Galactic coordinates (Johnson & Soderblom 1987):

$\begin{eqnarray}&&(X,Y,Z)=\varpi {\boldsymbol{\lambda }}(\alpha ,\delta )\end{eqnarray} \tag{ 25 }$

it is apparent that ${\boldsymbol{ \mathcal B }}=0$ . The determinant of $\bar{\bar{J}}$ can be obtained from the following property of block matrices, and further simplified:

$\begin{eqnarray*}| \bar{\bar{J}}| & = & | {\boldsymbol{ \mathcal A }}{\boldsymbol{ \mathcal D }}-0{\boldsymbol{ \mathcal C }}| ,\\ & = & | {\boldsymbol{ \mathcal A }}{\boldsymbol{ \mathcal D }}| ,\\ & = & | {\boldsymbol{ \mathcal A }}| \cdot | {\boldsymbol{ \mathcal D }}| .\end{eqnarray*}$

Using the definition of Galactic coordinates in Equation (25), the sub-matrix ${\boldsymbol{ \mathcal A }}$ can be simplified to:

$\begin{eqnarray*}{\boldsymbol{ \mathcal A }} & = & \left[\begin{array}{ccc}\varpi \tfrac{\partial {\lambda }_{0}}{\partial \alpha } & \varpi \tfrac{\partial {\lambda }_{0}}{\partial \delta } & {\lambda }_{0}\\ \varpi \tfrac{\partial {\lambda }_{1}}{\partial \alpha } & \varpi \tfrac{\partial {\lambda }_{1}}{\partial \delta } & {\lambda }_{1}\\ \varpi \tfrac{\partial {\lambda }_{2}}{\partial \alpha } & \varpi \tfrac{\partial {\lambda }_{2}}{\partial \delta } & {\lambda }_{2}\end{array}\right],\\ | {\boldsymbol{ \mathcal A }}| & = & {\varpi }^{2}{\lambda }_{0}\left(\displaystyle \frac{\partial {\lambda }_{1}}{\partial \alpha }\displaystyle \frac{\partial {\lambda }_{2}}{\partial \delta }-\displaystyle \frac{\partial {\lambda }_{1}}{\partial \delta }\displaystyle \frac{\partial {\lambda }_{2}}{\partial \alpha }\right)\\ & & -\,{\varpi }^{2}{\lambda }_{1}\left(\displaystyle \frac{\partial {\lambda }_{0}}{\partial \alpha }\displaystyle \frac{\partial {\lambda }_{2}}{\partial \delta }-\displaystyle \frac{\partial {\lambda }_{0}}{\partial \delta }\displaystyle \frac{\partial {\lambda }_{2}}{\partial \alpha }\right)\\ & & +\,{\varpi }^{2}{\lambda }_{2}\left(\displaystyle \frac{\partial {\lambda }_{0}}{\partial \alpha }\displaystyle \frac{\partial {\lambda }_{1}}{\partial \delta }-\displaystyle \frac{\partial {\lambda }_{0}}{\partial \delta }\displaystyle \frac{\partial {\lambda }_{1}}{\partial \alpha }\right),\\ | {\boldsymbol{ \mathcal A }}| & = & {\varpi }^{2}f(\alpha ,\delta ),\end{eqnarray*}$

where $f(\alpha ,\delta )$ is a function of sky coordinates only.
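As a quick numerical check of this result, the Python sketch below builds ${\boldsymbol{ \mathcal A }}$ for an assumed direction vector λ = (cos δ cos α, cos δ sin α, sin δ). This is the equatorial unit vector rather than the Galactic one used in the paper; the two differ only by a fixed rotation, which leaves the determinant unchanged. Under this assumption the determinant reduces to ϖ² cos δ, so that f(α, δ) = cos δ. The function names are illustrative, and `dist` denotes the quantity written ϖ in Equation (25).

```python
import math


def matrix_a(alpha, delta, dist):
    """Sub-matrix A for (X, Y, Z) = dist * lambda(alpha, delta).

    Columns hold the derivatives with respect to alpha, delta and dist.
    lambda is taken to be the unit vector of the equatorial direction
    (an assumption made for this illustration).
    """
    ca, sa = math.cos(alpha), math.sin(alpha)
    cd, sd = math.cos(delta), math.sin(delta)
    lam = (cd * ca, cd * sa, sd)
    dlam_dalpha = (-cd * sa, cd * ca, 0.0)
    dlam_ddelta = (-sd * ca, -sd * sa, cd)
    return [
        [dist * dlam_dalpha[i], dist * dlam_ddelta[i], lam[i]]
        for i in range(3)
    ]


def det3(m):
    """3 x 3 determinant, expanded along the third column as in the text."""
    return (
        m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
        - m[1][2] * (m[0][0] * m[2][1] - m[0][1] * m[2][0])
        + m[2][2] * (m[0][0] * m[1][1] - m[0][1] * m[1][0])
    )


def f_sky(alpha, delta, dist):
    """Ratio |A| / dist^2, which should not depend on dist."""
    return det3(matrix_a(alpha, delta, dist)) / dist**2


# The ratio is the same at two different values of dist (arbitrary units).
ratio_near = f_sky(1.2, -0.4, 10.0)
ratio_far = f_sky(1.2, -0.4, 150.0)
```

The two ratios agree to numerical precision, which is the statement that the dependence of the determinant on ϖ is carried entirely by the ϖ² factor.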

From the definition of space velocity (see Section 3.2):

$\begin{eqnarray*}&&(U,V,W)=\varpi {\boldsymbol{N}}(\alpha ,\delta ,{\mu }_{\alpha },{\mu }_{\delta })+\nu {\boldsymbol{M}}(\alpha ,\delta ),\end{eqnarray*}$

where the term $\cos \delta$ has been omitted in ${\mu }_{\alpha }\cos \delta$ , the sub-matrix ${\boldsymbol{ \mathcal D }}$ can be simplified to:

$\begin{eqnarray*}{\boldsymbol{ \mathcal D }} & = & \left[\begin{array}{ccc}\varpi \displaystyle \frac{\partial {N}_{0}}{\partial {\mu }_{\alpha }} & \varpi \displaystyle \frac{\partial {N}_{0}}{\partial {\mu }_{\delta }} & {M}_{0}\\ \varpi \displaystyle \frac{\partial {N}_{1}}{\partial {\mu }_{\alpha }} & \varpi \displaystyle \frac{\partial {N}_{1}}{\partial {\mu }_{\delta }} & {M}_{1}\\ \varpi \displaystyle \frac{\partial {N}_{2}}{\partial {\mu }_{\alpha }} & \varpi \displaystyle \frac{\partial {N}_{2}}{\partial {\mu }_{\delta }} & {M}_{2}\end{array}\right],\\ | {\boldsymbol{ \mathcal D }}| & = & {\varpi }^{2}{M}_{0}\left(\displaystyle \frac{\partial {N}_{1}}{\partial {\mu }_{\alpha }}\displaystyle \frac{\partial {N}_{2}}{\partial {\mu }_{\delta }}-\displaystyle \frac{\partial {N}_{1}}{\partial {\mu }_{\delta }}\displaystyle \frac{\partial {N}_{2}}{\partial {\mu }_{\alpha }}\right)\\ & & -\,{\varpi }^{2}{M}_{1}\left(\displaystyle \frac{\partial {N}_{0}}{\partial {\mu }_{\alpha }}\displaystyle \frac{\partial {N}_{2}}{\partial {\mu }_{\delta }}-\displaystyle \frac{\partial {N}_{0}}{\partial {\mu }_{\delta }}\displaystyle \frac{\partial {N}_{2}}{\partial {\mu }_{\alpha }}\right)\\ & & +\,{\varpi }^{2}{M}_{2}\left(\displaystyle \frac{\partial {N}_{0}}{\partial {\mu }_{\alpha }}\displaystyle \frac{\partial {N}_{1}}{\partial {\mu }_{\delta }}-\displaystyle \frac{\partial {N}_{0}}{\partial {\mu }_{\delta }}\displaystyle \frac{\partial {N}_{1}}{\partial {\mu }_{\alpha }}\right),\\ | {\boldsymbol{ \mathcal D }}| & = & {\varpi }^{2}g(\alpha ,\delta ).\end{eqnarray*}$

where $g(\alpha ,\delta )$ is a function of sky coordinates only. The function g does not depend on the proper motion components because all components of ${\boldsymbol{N}}$ , defined in Section 3.2, depend linearly on them, so that their partial derivatives with respect to proper motion are functions of sky coordinates alone. It follows that:

$\begin{eqnarray*}&&| \bar{\bar{J}}| \propto {\varpi }^{4}.\end{eqnarray*}$

Appendix B: Solving the Marginalization Integrals

In this section, the analytical solution to the marginalization integrals of Equation (3), over distance and radial velocity, is developed. The index k referring to hypothesis H_k is omitted in this section for simplicity. The Bayesian likelihood in the Galactic frame of reference {Q_i} described in Equation (11) can be inserted in Equation (3) to obtain:

$\begin{eqnarray*}&&{ \mathcal P }(\{{O}_{i}\}| H)={\int }_{0}^{\infty }{\int }_{-\infty }^{\infty }{\varpi }^{4}\displaystyle \frac{{e}^{-\tfrac{1}{2}{{ \mathcal M }}^{2}}}{\sqrt{{(2\pi )}^{6}| \bar{\bar{{\rm{\Sigma }}}}| }}\,d\nu \,d\varpi ,\end{eqnarray*}$

and the Mahalanobis distance ${ \mathcal M }$ defined in Equation (10) can be used to develop it further:

$\begin{eqnarray}&&{ \mathcal P }(\{{O}_{i}\}| H)={C}_{0}{\int }_{0}^{\infty }{I}_{0}(\varpi ){\varpi }^{4}{e}^{-\tfrac{1}{2}\langle \bar{{\rm{\Gamma }}},\bar{{\rm{\Gamma }}}\rangle {\varpi }^{2}+\langle \bar{{\rm{\Gamma }}},\bar{\tau }\rangle \varpi }\,d\varpi ,\end{eqnarray} \tag{ 26 }$

$\begin{eqnarray}{I}_{0}(\varpi ) & = & {\displaystyle \int }_{-\infty }^{\infty }{e}^{-\tfrac{1}{2}\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle {\nu }^{2}+(\langle \bar{{\rm{\Omega }}},\bar{\tau }\rangle -\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Gamma }}}\rangle \varpi )\nu }\,d\nu ,\\ {C}_{0} & = & \displaystyle \frac{{e}^{-\tfrac{1}{2}\langle \bar{\tau },\bar{\tau }\rangle }}{\sqrt{{(2\pi )}^{6}| \bar{\bar{{\rm{\Sigma }}}}| }}.\end{eqnarray} \tag{ 27 }$

Equation (27) can be solved with the identity:

$\begin{eqnarray*}&&{\int }_{-\infty }^{\infty }{e}^{-{{ax}}^{2}-{bx}}\,{dx}=\sqrt{\displaystyle \frac{\pi }{a}}{e}^{{b}^{2}/4a},\end{eqnarray*}$

with $a=\tfrac{1}{2}\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle$ and $b=-\langle \bar{{\rm{\Omega }}},\bar{\tau }\rangle +\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Gamma }}}\rangle \varpi$ .

The term in b² can be developed into a second-degree polynomial in ϖ:

$\begin{eqnarray*}&&\displaystyle \frac{{b}^{2}}{4a}=\displaystyle \frac{1}{2}\displaystyle \frac{{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Gamma }}}\rangle }^{2}}{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle }{\varpi }^{2}-\displaystyle \frac{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Gamma }}}\rangle \langle \bar{{\rm{\Omega }}},\bar{\tau }\rangle }{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle }\varpi +\displaystyle \frac{1}{2}\displaystyle \frac{{\langle \bar{{\rm{\Omega }}},\bar{\tau }\rangle }^{2}}{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle },\end{eqnarray*}$

which can be inserted back into Equation (26):

$\begin{eqnarray}{ \mathcal P }(\{{O}_{i}\}| H) & = & \displaystyle \frac{{| \bar{{\rm{\Omega }}}| }^{-1}{e}^{-\zeta }}{\sqrt{{(2\pi )}^{5}| \bar{\bar{{\rm{\Sigma }}}}| }}{\displaystyle \int }_{0}^{\infty }{\varpi }^{4}{e}^{-\beta {\varpi }^{2}-\gamma \varpi }\,d\varpi ,\\ \mathrm{where}\,\beta & = & \displaystyle \frac{\langle \bar{{\rm{\Gamma }}},\bar{{\rm{\Gamma }}}\rangle }{2}-\displaystyle \frac{1}{2}\displaystyle \frac{{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Gamma }}}\rangle }^{2}}{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle },\\ \gamma & = & \displaystyle \frac{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Gamma }}}\rangle \langle \bar{{\rm{\Omega }}},\bar{\tau }\rangle }{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle }-\langle \bar{{\rm{\Gamma }}},\bar{\tau }\rangle ,\\ \zeta & = & \displaystyle \frac{\langle \bar{\tau },\bar{\tau }\rangle }{2}-\displaystyle \frac{1}{2}\displaystyle \frac{{\langle \bar{{\rm{\Omega }}},\bar{\tau }\rangle }^{2}}{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle }.\end{eqnarray} \tag{ 28 }$
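The reduction from Equation (26) to Equation (28) can be checked numerically. The sketch below is only an illustration: it takes the six inner products as plain numbers (the values are hypothetical, chosen to satisfy the Cauchy-Schwarz inequality), integrates $e^{-{{ \mathcal M }}^{2}/2}$ over ν at a fixed ϖ with a simple Simpson rule, and compares the result with the closed form $\sqrt{2\pi /\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle }\,{e}^{-\beta {\varpi }^{2}-\gamma \varpi -\zeta }$. The names `coefficients`, `simpson`, and `integrand_nu` are assumptions made for this example and are not part of BANYAN Σ.

```python
import math

def coefficients(oo, og, ot, gg, gt, tt):
    """Return (beta, gamma, zeta) of Equation (28).

    Arguments are the scalar inner products <O,O>, <O,G>, <O,t>,
    <G,G>, <G,t>, <t,t>, with O = Omega, G = Gamma, t = tau.
    """
    beta = 0.5 * gg - 0.5 * og**2 / oo
    gamma = og * ot / oo - gt
    zeta = 0.5 * tt - 0.5 * ot**2 / oo
    return beta, gamma, zeta

def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

# Hypothetical inner products, for illustration only.
oo, og, ot, gg, gt, tt = 2.0, 0.6, 1.1, 1.5, 0.9, 3.0
beta, gamma, zeta = coefficients(oo, og, ot, gg, gt, tt)

def integrand_nu(nu, w):
    """exp(-M^2 / 2) expanded in the inner products; w is the distance."""
    m2 = (oo * nu**2 + gg * w**2 + 2 * og * nu * w
          - 2 * ot * nu - 2 * gt * w + tt)
    return math.exp(-0.5 * m2)

w = 0.8  # arbitrary fixed distance
numeric = simpson(lambda nu: integrand_nu(nu, w), -30.0, 30.0)
closed = math.sqrt(2 * math.pi / oo) * math.exp(-beta * w**2 - gamma * w - zeta)
```

With these inputs the two quantities `numeric` and `closed` should agree to within the quadrature error, which supports the signs of β, γ, and ζ as written above.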

The solution to this integral is given by Erdelyi (1955) and Gradshteyn & Ryzhik (2014):

$\begin{eqnarray*}&&{\int }_{0}^{\infty }{x}^{n}{e}^{-\beta {x}^{2}-\gamma x}\ {dx}=\displaystyle \frac{n!\ {e}^{{\gamma }^{2}/8\beta }}{{(2\beta )}^{(n+1)/2}}\ {{ \mathcal D }}_{-(n+1)}(\gamma /\sqrt{2\beta }),\end{eqnarray*}$

where ${{ \mathcal D }}_{m}(x)$ is a parabolic cylinder function (Magnus & Oberhettinger 1948).

The case m = −5 (n = 4) corresponds to the required integral, and the corresponding parabolic cylinder function can be developed as:

$\begin{eqnarray*}{{ \mathcal D }}_{-5}(x) & = & \displaystyle \frac{{e}^{{x}^{2}/4}}{24}\left(\sqrt{\displaystyle \frac{\pi }{2}}({x}^{4}+6{x}^{2}+3)\mathrm{erfc}\left(\tfrac{x}{\sqrt{2}}\right)\right.\\ & & -\,({x}^{3}+5x){e}^{-{x}^{2}/2}).\end{eqnarray*}$

Equation (28) becomes:

$\begin{eqnarray}&&{ \mathcal P }(\{{O}_{i}\}| H)=\displaystyle \frac{3}{4}\displaystyle \frac{{{ \mathcal D }}_{-5}(\gamma /\sqrt{2\beta }){e}^{{\gamma }^{2}/8\beta -\zeta }}{| \bar{{\rm{\Omega }}}| \sqrt{{\pi }^{5}{\beta }^{5}| \bar{\bar{{\rm{\Sigma }}}}| }}.\end{eqnarray} \tag{ 29 }$

The term in ${e}^{{x}^{2}/4}$ in the definition of ${{ \mathcal D }}_{-5}(x)$ can become very large for typical values of x, making the numerical computation of ${{ \mathcal D }}_{-5}(x)$ unstable. To avoid this problem, a modified parabolic cylinder function ${{ \mathcal D }}_{-5}^{{\prime} }(x)$ can be defined so that the large term is combined with the exponential term in Equation (29):

$\begin{eqnarray*}{ \mathcal P }(\{{O}_{i}\}| H) & = & \displaystyle \frac{1}{32}\displaystyle \frac{{{ \mathcal D }}_{-5}^{{\prime} }(\gamma /\sqrt{2\beta }){e}^{{\gamma }^{2}/4\beta -\zeta }}{| \bar{{\rm{\Omega }}}| \sqrt{{\pi }^{5}{\beta }^{5}| \bar{\bar{{\rm{\Sigma }}}}| }},\\ {{ \mathcal D }}_{-5}^{{\prime} }(x) & = & 24\,{e}^{-{x}^{2}/4}\,{{ \mathcal D }}_{-5}(x).\end{eqnarray*}$

This completes the analytical solution of the Bayesian likelihood. The 1/32 factor will be ignored here because it will disappear in the marginalization of the Bayesian likelihood (Equation (1)), and other multiplicative factors independent of the Bayesian hypothesis have already been ignored in determining the Jacobian of the coordinate transformation (see Appendix A).
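In practice the factor ${e}^{{\gamma }^{2}/4\beta }$ can still overflow, so one option is to evaluate the logarithm of the likelihood. The following is a minimal sketch under stated assumptions, not the BANYAN Σ implementation: the function names are hypothetical, $| \bar{{\rm{\Omega }}}|$ and $\mathrm{ln}| \bar{\bar{{\rm{\Sigma }}}}|$ are assumed to be supplied by the caller, and the 1/32 factor is dropped as described above.

```python
import math

def d5_modified(x):
    """Modified function D'_{-5}(x) = 24 exp(-x^2 / 4) D_{-5}(x)."""
    erfc_term = (math.sqrt(math.pi / 2) * (x**4 + 6 * x**2 + 3)
                 * math.erfc(x / math.sqrt(2)))
    exp_term = (x**3 + 5 * x) * math.exp(-0.5 * x**2)
    return erfc_term - exp_term

def log_likelihood(beta, gamma, zeta, norm_omega, log_det_sigma):
    """ln P({O_i}|H) with the constant 1/32 factor omitted.

    norm_omega is |Omega| and log_det_sigma is ln|Sigma| (assumed inputs).
    """
    x = gamma / math.sqrt(2 * beta)
    return (math.log(d5_modified(x))
            + gamma**2 / (4 * beta) - zeta
            - math.log(norm_omega)
            - 2.5 * math.log(math.pi * beta)
            - 0.5 * log_det_sigma)
```

This sketch is adequate for negative or moderate values of $x=\gamma /\sqrt{2\beta }$. For large positive x the two terms of `d5_modified` nearly cancel and the difference loses precision, so a production code would need a separate treatment of that regime.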

Appendix C: Determining the Optimal Radial Velocity and Distance

The optimal radial velocity ν_o and distance ϖ_o that maximize the value of the non-marginalized Bayesian likelihood ${{ \mathcal P }}_{o}(\{{O}_{i}\}| H)$ can be obtained by solving the system of equations:

$\begin{eqnarray*}{\left.\displaystyle \frac{\partial \mathrm{ln}{{ \mathcal P }}_{o}(\{{O}_{i}\}| H)}{\partial \nu }\right|}_{\nu ={\nu }_{{\rm{o}}},\varpi ={\varpi }_{{\rm{o}}}} & = & 0,\\ {\left.\displaystyle \frac{\partial \mathrm{ln}{{ \mathcal P }}_{o}(\{{O}_{i}\}| H)}{\partial \varpi }\right|}_{\nu ={\nu }_{{\rm{o}}},\varpi ={\varpi }_{{\rm{o}}}} & = & 0,\end{eqnarray*}$

which can be developed with Equation (10):

$\begin{eqnarray*}0 & = & \langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle {\nu }_{{\rm{o}}}+\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Gamma }}}\rangle {\varpi }_{{\rm{o}}}-\langle \bar{{\rm{\Omega }}},\bar{\tau }\rangle ,\\ 0 & = & \langle \bar{{\rm{\Gamma }}},\bar{{\rm{\Gamma }}}\rangle {\varpi }_{{\rm{o}}}^{2}+\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Gamma }}}\rangle {\varpi }_{{\rm{o}}}{\nu }_{{\rm{o}}}-\langle \bar{{\rm{\Gamma }}},\bar{\tau }\rangle {\varpi }_{{\rm{o}}}-4.\end{eqnarray*}$

This system of equations has two solutions:

$\begin{eqnarray*}{\varpi }_{{\rm{o}}} & = & \displaystyle \frac{-\gamma \pm \sqrt{{\gamma }^{2}+32\beta }}{4\beta },\\ {\nu }_{{\rm{o}}} & = & \displaystyle \frac{4+\langle \bar{{\rm{\Gamma }}},\bar{\tau }\rangle {\varpi }_{{\rm{o}}}-\langle \bar{{\rm{\Gamma }}},\bar{{\rm{\Gamma }}}\rangle {\varpi }_{{\rm{o}}}^{2}}{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Gamma }}}\rangle {\varpi }_{{\rm{o}}}},\end{eqnarray*}$

where γ and β are defined in Equations (14) and (13).

Any combination of γ and β that respects the following inequality:

$\begin{eqnarray*}\sqrt{1+32\beta /{\gamma }^{2}} & \gt & 1,\\ \,{\rm{i}}.{\rm{e}}.,\ \beta & \gt & 0,\end{eqnarray*}$

will yield an unphysical negative distance for the negative root of ϖ_o. Since multivariate Gaussians have β > 0 by definition, the inequality is always respected. As a consequence, only the positive root of ϖ_o has a physical meaning.
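The positive root can be evaluated directly from the inner products. The sketch below is illustrative only, with a hypothetical function name and arguments. It obtains ν_o from the first equation of the system rather than from the expression given above; the two are algebraically equivalent at the solution, and this choice avoids dividing by $\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Gamma }}}\rangle$ , which may be small.

```python
import math

def optimal_distance_rv(oo, og, ot, gg, gt):
    """Return (varpi_o, nu_o) for the non-marginalized likelihood.

    Arguments are the scalar inner products <O,O>, <O,G>, <O,t>,
    <G,G>, <G,t>, with O = Omega, G = Gamma, t = tau.
    """
    beta = 0.5 * gg - 0.5 * og**2 / oo
    gamma = og * ot / oo - gt
    # Positive root of 2*beta*w^2 + gamma*w - 4 = 0.
    varpi_o = (-gamma + math.sqrt(gamma**2 + 32 * beta)) / (4 * beta)
    # First stationarity equation: <O,O> nu + <O,G> varpi - <O,t> = 0.
    nu_o = (ot - og * varpi_o) / oo
    return varpi_o, nu_o

# Hypothetical inner products, for illustration only.
varpi_o, nu_o = optimal_distance_rv(2.0, 0.6, 1.1, 1.5, 0.9)
```

Substituting the returned pair into both equations of the system is a quick way to confirm that it is a stationary point.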

Error bars on the optimal radial velocity σ_ν and distance σ_ϖ can be defined by measuring the characteristic width of ${{ \mathcal P }}_{o}(\{{O}_{i}\}| H)$ along ν and ϖ in the vicinity of (ν_o, ϖ_o). The effect of the Jacobian term ϖ⁴ will be ignored to obtain an analytical approximation of (σ_ν, σ_ϖ).

The relation between the expectation value E(x) of a variable and the characteristic width of a normalized Gaussian function G(x) can be used to determine σ_ν and σ_ϖ:

$\begin{eqnarray*}{\sigma }_{x} & = & \sqrt{E({x}^{2})-E{(x)}^{2}},\\ E(x) & = & {\displaystyle \int }_{-\infty }^{\infty }{xG}(x){dx}.\end{eqnarray*}$

In the case of σ_ν, this yields:

$\begin{eqnarray*}E(\nu ) & = & \displaystyle \frac{{\displaystyle \int }_{-\infty }^{\infty }\nu {e}^{-{\beta }_{\nu }{\nu }^{2}-{\gamma }_{\nu }\nu }\,d\nu }{{\displaystyle \int }_{-\infty }^{\infty }{e}^{-{\beta }_{\nu }{\nu }^{2}-{\gamma }_{\nu }\nu }\,d\nu },\\ & = & -\displaystyle \frac{{\gamma }_{\nu }}{2{\beta }_{\nu }},\\ E({\nu }^{2}) & = & \displaystyle \frac{{\displaystyle \int }_{-\infty }^{\infty }{\nu }^{2}{e}^{-{\beta }_{\nu }{\nu }^{2}-{\gamma }_{\nu }\nu }\,d\nu }{{\displaystyle \int }_{-\infty }^{\infty }{e}^{-{\beta }_{\nu }{\nu }^{2}-{\gamma }_{\nu }\nu }\,d\nu },\\ & = & \displaystyle \frac{2{\beta }_{\nu }+{\gamma }_{\nu }^{2}}{4{\beta }_{\nu }^{2}},\\ {\beta }_{\nu } & = & \displaystyle \frac{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle }{2},\\ {\gamma }_{\nu } & = & {\varpi }_{{\rm{o}}}\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Gamma }}}\rangle -\langle \bar{{\rm{\Omega }}},\bar{\tau }\rangle ,\end{eqnarray*}$

leading to:

$\begin{eqnarray*}{\sigma }_{\nu } & = & \displaystyle \frac{1}{\sqrt{2{\beta }_{\nu }}},\\ {\sigma }_{\nu } & = & | \bar{{\rm{\Omega }}}{| }^{-1}.\end{eqnarray*}$
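The intermediate step can be written out explicitly. Assuming, as the two expressions above imply, that ${| \bar{{\rm{\Omega }}}| }^{2}=\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle$ , the terms in ${\gamma }_{\nu }^{2}$ cancel when the two moments are combined:

$\begin{eqnarray*}{\sigma }_{\nu }^{2} & = & E({\nu }^{2})-E{(\nu )}^{2}\\ & = & \displaystyle \frac{2{\beta }_{\nu }+{\gamma }_{\nu }^{2}}{4{\beta }_{\nu }^{2}}-\displaystyle \frac{{\gamma }_{\nu }^{2}}{4{\beta }_{\nu }^{2}}\\ & = & \displaystyle \frac{1}{2{\beta }_{\nu }}=\displaystyle \frac{1}{\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Omega }}}\rangle }.\end{eqnarray*}$

The width therefore does not depend on ${\gamma }_{\nu }$ , and hence not on the value of the optimal distance.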

The case of σ_ϖ requires the introduction of a new variable ϖ' that is defined in the range $(-\infty ,\infty )$ and matches the distance, ϖ' = ϖ, for ϖ' ≥ 0. Assuming that ϖ_o ≫ σ_ϖ ensures that the Bayesian likelihood ${{ \mathcal P }}_{o}(\nu ,{\varpi }^{{\prime} })\approx 0$ for all negative values of ϖ'. It follows that:

$\begin{eqnarray*}E(\varpi ) & \approx & \displaystyle \frac{{\displaystyle \int }_{-\infty }^{\infty }{\varpi }^{{\prime} }\,{e}^{-{\beta }_{\varpi }{{\varpi }^{{\prime} }}^{2}-{\gamma }_{\varpi }{\varpi }^{{\prime} }}\,d{\varpi }^{{\prime} }}{{\displaystyle \int }_{-\infty }^{\infty }{e}^{-{\beta }_{\varpi }{{\varpi }^{{\prime} }}^{2}-{\gamma }_{\varpi }{\varpi }^{{\prime} }}\,d{\varpi }^{{\prime} }}\\ & = & -\displaystyle \frac{{\gamma }_{\varpi }}{2{\beta }_{\varpi }},\\ E({\varpi }^{2}) & \approx & \displaystyle \frac{2{\beta }_{\varpi }+{\gamma }_{\varpi }^{2}}{4{\beta }_{\varpi }^{2}},\\ {\beta }_{\varpi } & = & \displaystyle \frac{\langle \bar{{\rm{\Gamma }}},\bar{{\rm{\Gamma }}}\rangle }{2},\\ {\gamma }_{\varpi } & = & {\nu }_{{\rm{o}}}\langle \bar{{\rm{\Omega }}},\bar{{\rm{\Gamma }}}\rangle -\langle \bar{{\rm{\Gamma }}},\bar{\tau }\rangle ,\end{eqnarray*}$

leading to:

$\begin{eqnarray*}&&{\sigma }_{\varpi }\approx | \bar{{\rm{\Gamma }}}{| }^{-1}.\end{eqnarray*}$

The optimal distance and radial velocity do not correspond exactly to the statistical distance and radial velocity defined in the BANYAN II formalism (Gagné et al. 2014). The latter are obtained by maximizing the Bayesian likelihood in one dimension after the other dimension has been marginalized. The optimal distance and radial velocity jointly maximize the Bayesian probability of a given hypothesis, whereas the BANYAN II statistical distance maximizes the Bayesian probability when radial velocity is treated as an unknown parameter, and vice versa.
