Uncovering a new group of T Tauri stars in the Taurus-Auriga molecular complex from Gaia and GALEX data

In this work, we examine the list of 63 T Tauri star (TTS) candidates in the Taurus-Auriga molecular complex (TAMC) identified by their ultraviolet (UV) and infrared (IR) colours, measured from data obtained by the Galaxy Evolution Explorer all-sky survey (GALEX-AIS) and the Two Micron All Sky Survey (2MASS), respectively. The objective of this work is twofold: to evaluate whether they are pre-main sequence (PMS) stars and to assess how well the UV-IR colour-colour diagram detects PMS stars in wide fields. The astrometric properties of these sources have been retrieved from the Gaia DR3 catalogue and used to evaluate their membership probability. Several classification algorithms have been tested to search for the kinematical groups, but the final classification has been made with the k-means++ algorithm. Membership probability has been evaluated by applying logistic regression. In addition, spectroscopic information available in the archive of the Large Sky Area Multi-Object Fiber Spectroscopic Telescope has been used, when available, to ascertain their PMS nature. About 20% of the candidates share the kinematics of the TAMC members. Among them, HD 281691 is a G8-type field star located in front of the cloud and HO Aur is likely a halo star given the very low metallicity provided by Gaia. The rest are three known PMS stars (HD 30171, V600 Aur and J04590305+3003004), two previously unknown accreting M-type stars (J04510713+1708468 and J05240794+2542438) and five additional sources, which are very likely PMS stars. Most of these new sources are concentrated at low galactic latitudes over the Auriga-Perseus region.


Introduction
The accurate determination of the low-mass end of the initial mass function (IMF) is challenging but fundamental to understand the efficiency of star formation and to constrain the significance of the low-mass leftovers in the evolution of galaxies over cosmic scales. There have been systematic attempts to make a complete accounting of this "hidden" population in the solar neighbourhood, where the sensitivity for this investigation is optimal (see e.g. Luhman 2018 and references therein). The nearest stellar nurseries are located within 200 pc, in a rim of molecular clouds that includes prominent star forming regions such as Taurus-Auriga, Lupus, Chamaeleon, Ophiuchus or Sco-Cen. As a result, the Solar System is in a privileged location within the Galaxy to investigate the formation of loose stellar associations and determine the low-mass end of their IMF.
The Taurus-Auriga molecular complex (TAMC) is the best studied region in the area. The TAMC is part of the prominent ridge of molecular gas observed towards the anticenter of the Galaxy; large-scale maps of the region date back to the CO survey carried out with the Columbia millimeter-wave telescope by Ungerechts & Thaddeus (1987). The TAMC extends over roughly 20° × 20° in the sky (see Fig. 1) and is made up of several filaments and cores (Goldsmith et al. 2008; Garufi et al. 2021; Roccatagliata et al. 2020; Long et al. 2019). There are more than 500 sources sparsely distributed over the complex that could be pre-main sequence (PMS) stars (see Luhman 2022 for a recent accounting). The very young population (age ≤ 1 Myr) is well identified by its significant near-infrared excess, its proximity to the parent molecular gas and the presence of jets or molecular outflows; however, the census of the most evolved sources, in particular the so-called weak-line T Tauri stars (WTTSs), is still to be completed, especially at the very low mass end towards the brown dwarf frontier.
The release of the Gaia DR2 catalogue (Gaia Collaboration et al. 2018) enabled for the first time an unbiased search for WTTSs in the region based on the kinematical properties of the stars. The Gaia survey is magnitude-limited to G ≃ 21 and is thus sensitive to the low-mass stellar population (late M types) in the TAMC, which is located at ∼ 140 pc. Galli et al. (2019) generated a compilation of all the possible T Tauri stars (TTSs) in the region, including both spectroscopically confirmed sources and possible candidates, which were identified by the surveys carried out with the Spitzer infrared telescope (Luhman et al. 2010), the Sloan Digital Sky Survey and the Two Micron All Sky Survey (Luhman et al. 2017). Using a Hierarchical Mode Association Clustering (HMAC) algorithm, they classified these sources into 21 distinct kinematical groups. A recent re-evaluation based on Gaia DR3 (Gaia Collaboration et al. 2022) data has resulted in a grand total of 532 adopted members distributed in 13 kinematical groups (Luhman 2022). This kinematical search does not provide a complete accounting of the young stars in the region, since young stars with peculiar proper motions, such as stars ejected in close encounters or stars in multiple systems, will be rejected as outliers by the classification algorithms. Also, no information on their youth and evolutionary state can be obtained in this manner. An additional problem is the scarce and uncertain information provided by Gaia DR3 on the radial velocity of these stars, which hinders an accurate determination of the three components of the LSR velocity (U, V, W) for a large fraction of the sources.
The most common and successful ways to search for WTTSs in molecular clouds are based on their strong magnetic activity, which results in an enhancement of the atmospheric flux: coronal emission at X-ray wavelengths, radiation from the transition region at ultraviolet (UV) wavelengths and chromospheric radiation measurable at UV but also at optical wavelengths. In fact, the first catalogues of TTSs were built from stars displaying enhanced chromospheric emission in the Hα line (see e.g. Herbig & Kameswara Rao 1972). Many WTTS candidates were identified in the TAMC from the ROSAT all-sky X-ray survey (Neuhaeuser et al. 1995) and subsequent observations (especially of the Li I absorption) proved that many were indeed WTTSs. The release of the "all sky UV survey" carried out by the Galaxy Evolution Explorer (GALEX) made it possible to search for TTSs by their UV excess; a list of 63 new TTS candidates was compiled by comparing the UV colour and the IR colour excesses in the TAMC (Gómez de Castro et al. 2015, hereafter GdC2015). Gaia was launched in December 2013 and its data releases came much later than the publication of this list; thus, neither parallax nor kinematical information was available at the time to narrow down this large list of candidates. The purpose of this research is to carry out such an analysis and to identify the bona fide TTSs in the sample. To this end, we use the Gaia DR3 parallax and kinematical information as well as spectra from the spectroscopic survey carried out by the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST), when available, to determine the spectral type and the Li I equivalent width.
This article is structured as follows. In Section 2, the characteristics of the GdC2015 sources are described. Sources with Gaia DR3 parallaxes incompatible with TAMC membership are disregarded, reducing the working sample of viable candidates to 13 sources. In Section 3, the membership probability of these candidates is evaluated using Gaia DR3 data. In the process, a list of trusted TTSs in the TAMC for kinematical studies is produced; this compilation is significantly shorter than that produced by Luhman (2022). In Section 4, LAMOST spectra are analyzed; two new accreting late M-type stars are identified in the GdC2015 sample. Finally, a short discussion (Section 5) on the radial velocity distribution of the TTSs and the coupling with the molecular gas is included. A short summary of the work is given at the end, in Section 6. Technicalities concerning the error analysis of the samples are compiled in the Appendix to facilitate the reading of the work.

UV candidates to T Tauri stars in the Taurus-Auriga region
The GALEX All Sky Survey (AIS) (Martin et al. 2005) provided for the first time an unbiased view of the TAMC at UV wavelengths. Though the area was not mapped completely due to GALEX sensitivity constraints, 380 square degrees in the sky (equivalent to 197 GALEX fields) were imaged and as many as ∼163,000 UV sources were detected in the near ultraviolet (NUV) band (see GdC2015). Through a comparison of UV and IR colours, 63 new candidates to TTSs were identified; their location is shown in the top panel of Fig. 2 and their UV-IR colours in the bottom panel.
These 63 sources were selected using a qualification sample of TTSs that was generated from two sets of sources: [1] the 31 known TTSs detected in the GALEX AIS survey of the TAMC and [2] the 21 TTSs observed with the International Ultraviolet Explorer (IUE) in the low dispersion mode with high signal-to-noise ratio spectra (see Table 1 in GdC2015); synthetic GALEX photometry was calculated from the IUE spectra for these sources using the Morrissey et al. (2007) conversion. 2MASS photometry was available for all the sources in the qualification sample. All these stars are marked with blue squares (if CTTSs) and red circles (if WTTSs) in the bottom panel of Fig. 2. The 63 candidates to be evaluated in this work appear as crosses in the plot.
Note that the WTTSs define a clear regression line, with the hardest UV colours being observed in the sources displaying the smallest J−K colour. CTTSs, however, are scattered in the diagram. Most of the candidates studied in this work were identified in the WTTSs strip (see Fig. 2, bottom panel), close to the location of reddened massive cool stars, and many are located close to the Galactic plane (see Fig. 2, top panel). Thus, further observables such as distance or kinematics are needed to confirm the PMS nature of these sources.
Gaia DR3 counterparts have been searched and found for all candidates using a 2 arcsecond search radius employing the cross match services provided by the Centre de Données Stellaires (CDS).
In Fig. 3, we display the projected location on the plane of the sky of all sources within 1,200 pc. Only 13 of them have parallaxes compatible with TAMC membership, i.e. between 5 mas and 9 mas according to Roccatagliata et al. (2020). Seven of them are concentrated in the north of the region, close to the Galactic plane, but are separated into two distance groups at ∼ 110 pc and ∼ 160 pc.
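The parallax selection described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: distances are obtained by naive inversion of the parallax, which is reasonable only for small relative errors, and the source names and parallaxes below are hypothetical values, not entries of the candidate list.

```python
# Parallax window compatible with TAMC membership, as quoted in the text
# (5-9 mas, following Roccatagliata et al. 2020).
PARALLAX_MIN_MAS = 5.0
PARALLAX_MAX_MAS = 9.0


def distance_pc(parallax_mas):
    """Naive distance estimate in pc from a parallax in mas (d = 1000 / parallax)."""
    if parallax_mas <= 0:
        raise ValueError("the naive inversion needs a positive parallax")
    return 1000.0 / parallax_mas


def compatible_with_tamc(parallax_mas):
    """True if the parallax falls inside the adopted membership window."""
    return PARALLAX_MIN_MAS <= parallax_mas <= PARALLAX_MAX_MAS


# Hypothetical (name, parallax in mas) pairs, not taken from the paper's tables.
candidates = [("cand_A", 9.6), ("cand_B", 6.25), ("cand_C", 0.9), ("cand_D", 8.0)]

selected = [(name, distance_pc(plx)) for name, plx in candidates
            if compatible_with_tamc(plx)]

for name, dist in selected:
    print(f"{name}: {dist:.1f} pc")
```

With the 5-9 mas window, the accepted distance range is roughly 111 pc to 200 pc.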

Kinematical properties of the candidates
In addition to distance, stellar kinematics needs to be analyzed. This has been done in two steps. Firstly, the kinematics of known and trusted TTSs in the TAMC with high quality Gaia measurements is studied. Kinematical groups are identified within this sample using the k-means++ algorithm and then the probability of the 13 candidates being kinematical members is evaluated. Only trusted PMS sources with high quality astrometric data will be considered for this qualification sample. The Gaia Data Processing and Analysis Consortium states that a value of the renormalised unit weight error (RUWE) ≤ 1.4 is indicative of a good astrometric solution. Thus, only Gaia observations with RUWE ≤ 1.4 have been considered for the qualification of the kinematics of the TTSs in the TAMC. As an aside, note that according to Stassun & Torres (2021) RUWE values even slightly above 1.0 may be signaling unresolved binaries in Gaia data.

Kinematics of known TTSs in the TAMC
The initial sample of trusted TTSs in the TAMC has been taken from Joncour et al. (2017), although more recent catalogues are available. The initial qualification sample consisted of 170 TTSs with high quality astrometric measurements; however, after cross-checking the Gaia parallaxes, three sources were found not to belong to the TAMC and have not been considered (see Table 1). The final qualification sample consists of 92 classical TTSs (CTTSs) and 75 weak-line TTSs (WTTSs) (see Table 2).
Gaia proper motions in the ICRS system have been converted into the Galactic system using the procedure described in the Gaia EDR3 documentation. In Fig. 4, these proper motions are overlaid on the map of the area obtained by the Planck mission, where the distribution of the warm dust and the orientation of the magnetic field are shown (Planck Collaboration et al. 2016). The motion of the TTSs is very coherent, as expected in a young association that still keeps memory of the motion of the parent molecular cloud; the young gravitationally bound open cluster contains hundreds of members with similar ages and composition; however, the kinematical memory of this common origin will be lost over time due to the interaction with the large scale gravitational field of the Galaxy. As pointed out back in the early 1990s (Gomez de Castro & Pudritz 1992), the TTSs move roughly perpendicularly to the direction of the magnetic field traced by the polarization of the dust.
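The conversion of proper motions from the ICRS to the Galactic system can be sketched as follows. This is a minimal illustration of the standard rotation-matrix procedure (project the proper motion onto the local tangent-plane vectors, rotate, and project again), not necessarily the exact code used for this work; the function names are ours, the matrix is the ICRS-to-Galactic rotation as commonly quoted from the Hipparcos and Gaia documentation, and the input values in the usage lines are hypothetical.

```python
import numpy as np

# Rotation from ICRS to Galactic Cartesian coordinates (transpose of the
# Hipparcos A_G matrix), as commonly quoted in the Gaia documentation.
ICRS_TO_GAL = np.array([
    [-0.0548755604162154, -0.8734370902348850, -0.4838350155487132],
    [+0.4941094278755837, -0.4448296299600112, +0.7469822444972189],
    [-0.8676661490190047, -0.1980763734312015, +0.4559837761750669],
])


def _triad(lon, lat):
    """Unit vectors (r, p, q): line of sight, increasing longitude, increasing latitude."""
    r = np.array([np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat)])
    p = np.array([-np.sin(lon), np.cos(lon), 0.0])
    q = np.array([-np.sin(lat) * np.cos(lon), -np.sin(lat) * np.sin(lon), np.cos(lat)])
    return r, p, q


def pm_icrs_to_galactic(ra_deg, dec_deg, pmra, pmdec):
    """Return (l_deg, b_deg, mu_l*cos(b), mu_b); pmra must already include cos(dec)."""
    r_icrs, p_icrs, q_icrs = _triad(np.radians(ra_deg), np.radians(dec_deg))
    mu_icrs = p_icrs * pmra + q_icrs * pmdec      # proper motion as a 3D vector
    r_gal = ICRS_TO_GAL @ r_icrs                  # rotate position ...
    mu_gal = ICRS_TO_GAL @ mu_icrs                # ... and proper motion
    lon = np.arctan2(r_gal[1], r_gal[0]) % (2.0 * np.pi)
    lat = np.arctan2(r_gal[2], np.hypot(r_gal[0], r_gal[1]))
    _, p_gal, q_gal = _triad(lon, lat)
    return (float(np.degrees(lon)), float(np.degrees(lat)),
            float(p_gal @ mu_gal), float(q_gal @ mu_gal))


# Hypothetical star in the direction of Taurus (mas/yr for the proper motions).
l_deg, b_deg, pm_l, pm_b = pm_icrs_to_galactic(68.0, 25.0, 7.0, -20.0)
print(f"l = {l_deg:.2f} deg, b = {b_deg:.2f} deg, "
      f"mu_l* = {pm_l:.2f}, mu_b = {pm_b:.2f} mas/yr")
```

Since the transformation is a pure rotation, the modulus of the proper motion is preserved, which provides a quick sanity check of any implementation.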
There are mild differences between WTTSs and CTTSs. This is best shown by the μ_b/μ_l ratio (see Figs. 4 and 5); while CTTSs have a single-peaked distribution with a smooth tail to small ratios, the WTTSs display a double-peaked distribution. As CTTSs are younger stars, it is expected that they are better coupled to the gas and keep a common kinematics. This effect is also observed in the spatial distribution of the accreting TTSs and the WTTSs; CTTSs are located close to the molecular cloud while WTTSs are more sparsely distributed (Luhman et al. 2010). Two outliers are detected: the CTTS LkHα 332 (2MASS J04420777+2523118) and the WTTS 2MASS J04555288+3006523; LkHα 332 is at 11.3 arcsec from V1000 Tau, another TTS.

k-cluster analysis
The TAMC occupies a large area in the sky and several kinematic groups have been identified within the association (see e.g. Galli et al. 2019; Luhman 2023). In this work, we have used the k-means++ clustering algorithm to determine which groups of stars in our qualification sample could be kinematically related. Given their young age, kinematically coherent sources may have been born out of the same gas cloudlet within the TAMC. This algorithm is an unsupervised machine-learning technique that performs a centroid-based analysis using iterative refinement (Lloyd 2006; Arthur & Vassilvitskii 2007). In our case, we carried out a multiparametric analysis that included both positional information (parallax) and kinematic information (the projections of the proper motion along the Galactic coordinates), i.e. our data set has three parameters. The k-means++ algorithm is not density-based like DBSCAN or OPTICS, but partitional, so every object in the sample is assigned to one and only one cluster. This algorithm implicitly assumes that clusters are convex and isotropic and it performs best when applied to a mixture of Gaussian distributions with the same variances but perhaps different means. This mathematical precondition matches well the physical context of the TAMC sample when parallaxes and proper motions are considered. As our analysis does not include the coordinates of the sources but only parallaxes and proper motions, the actual distributions of the data are rather isotropic and normally distributed.
For each one of the 167 TTSs (92 CTTSs and 75 WTTSs), a three-dimensional vector was generated. An initial set of k points (centroids) was defined in this three-dimensional space and k clusters were created by associating every vector with the nearest centroid. In a second step, a new centroid was computed for each cluster by calculating the mean among the vectors. Later, the association of each vector to a given cluster was re-evaluated for the new centroid. The process was repeated until convergence was reached.
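The iterative scheme just described can be sketched with NumPy as follows. This is a didactic, dependency-light illustration with k-means++ seeding, not the Scikit-learn implementation; the function names are ours and the input is a synthetic set of two well separated Gaussian groups in a three-dimensional space, standing in for the standardised (proper motion, proper motion, parallax) vectors.

```python
import numpy as np


def kmeans_pp_init(X, k, rng):
    """k-means++ seeding: the first centroid is drawn uniformly, the next ones with
    probability proportional to the squared distance to the nearest chosen centroid."""
    centroids = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        d2 = np.min([np.sum((X - c) ** 2, axis=1) for c in centroids], axis=0)
        centroids.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centroids)


def kmeans(X, k, rng, max_iter=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = kmeans_pp_init(X, k, rng)
    for _ in range(max_iter):
        # Assignment step: every vector goes to its nearest centroid (Euclidean metric).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: new centroid = mean of the vectors assigned to the cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Synthetic, already standardised data: two groups of 50 vectors each.
rng = np.random.default_rng(42)
group_a = rng.normal(loc=[-2.0, -2.0, -2.0], scale=0.3, size=(50, 3))
group_b = rng.normal(loc=[2.0, 2.0, 2.0], scale=0.3, size=(50, 3))
X = np.vstack([group_a, group_b])

labels, centroids = kmeans(X, k=2, rng=rng)
print(np.bincount(labels), centroids.round(2))
```

The cluster labels (0 or 1) are arbitrary: only the partition of the sample is meaningful.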
Here, we used the k-means++ algorithm as implemented by the Python library Scikit-learn (Pedregosa et al. 2011). Before applying the k-means++ algorithm, we scaled the data set using Z-score normalization: we found the mean and standard deviation for each variable used in the analysis, subtracted the relevant mean from each value, and then divided by the corresponding standard deviation. Although the variables may originally have different variances, Z-score normalization standardizes the variances and avoids placing more weight on variables with larger variance when applying k-means++. Distance assignment between the data points in our sample assumed a Euclidean metric (other metrics gave consistent results) and we used the elbow method to determine the optimal number of clusters, k, i.e. the value beyond which adding more clusters no longer reduces appreciably the sum of the distances of all data points to their respective cluster centres. Test analyses using OPTICS on the same data set led to similar results.
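As an illustration of the scaling step, the following sketch applies Z-score normalization to a small array. The four rows are hypothetical (μ_l cos b, μ_b, parallax) values chosen only to show the effect; they are not entries of Table 2.

```python
import numpy as np


def zscore(X):
    """Standardise each column to zero mean and unit standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)


# Hypothetical rows of (mu_l*cos b [mas/yr], mu_b [mas/yr], parallax [mas]).
X = np.array([
    [4.0, -25.0, 7.6],
    [9.0, -18.0, 6.2],
    [5.5, -22.0, 7.4],
    [10.5, -16.0, 6.3],
])

raw_var = X.var(axis=0)   # very different variances before scaling
Z = zscore(X)             # unit variance in every column after scaling

print("variance before scaling:", raw_var.round(3))
print("variance after scaling: ", Z.var(axis=0).round(3))
```

Before scaling, the column with the largest variance dominates any Euclidean distance; after scaling, the three variables contribute on an equal footing.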
This procedure was applied to the list of bona fide TTSs (Table 2) but, in order to take into account the associated uncertainties and their correlations, we generated 10⁴ instances of the sample as explained in Appendix A and, for each synthetic instance, we determined the clusters through the k-means++ algorithm. In almost all cases, our clustering analysis produced two statistically significant clusters, which are represented in Fig. 6. Sources in the first group are located at an average distance of 160 pc while those in the second group are found at 130 pc.
This clean classification in two groups differs from the results of the previous works by Galli et al. (2019) and Luhman (2023). Galli et al. (2019) use Hierarchical Mode Association Clustering (HMAC), which is a non-parametric statistical approach to clustering analysis. HMAC finds the modes of a kernel-based estimate of the density of points in the working space and groups the data points associated with the same mode into one cluster of arbitrary shape. Clustering by mode identification requires only the space and the bandwidth of the kernel to be defined. In particular, Galli et al. (2019) make use of a 5-dimensional space consisting of the equatorial coordinates of the sources, their proper motions in equatorial coordinates and the parallax (α, δ, μ_δ, μ_α cos δ, π). The algorithm identifies 21 clusters hierarchically linked in the dendrogram, only 4 of them having more than 10 sources. This result is natural since the coordinates are taken into account and the stellar population is sparsely distributed in the TAMC; thus, stars with similar proper motions and parallaxes located in different areas are identified as independent groups. Luhman (2023) follows a different approach. Gaia astrometry is analyzed in terms of proper motion offsets, defined as the differences between the observed proper motion of the star and the motion expected at the celestial coordinates and parallactic distance of the star for a specified space velocity. This procedure is set to minimize the projection effects given the large extent of the TAMC, and the offsets are calculated with respect to the expected LSR velocity given in terms of the velocity vector (U, V, W) = (−16, −12, −9) km s⁻¹, which according to Luhman (2018) approximates the median velocity of Taurus members. Nine groups are identified in this manner (see Fig. 1 and Table 6 in Luhman 2023). These groups do not coincide with those in Galli et al. (2019), but some of the levels in the hierarchy defined by Galli et al. (2019) contain some of these groups.
In our work, the classification is made with a k-means++ clustering algorithm, which provides neither a hierarchical classification nor a dendrogram suitable for comparison with Galli et al. (2019). Moreover, the analysis in Sect. 3.1, as well as previous works (Gomez de Castro & Pudritz 1992), show that galactic coordinates are better suited for the kinematical analysis of the region (see Fig. 4). Proper motions are converted from equatorial to galactic coordinates and the cluster analysis is carried out in a 3-dimensional space including these proper motions and the parallax (μ_b, μ_l cos b, π). The number of groups identified by the algorithm is reduced to 2 (groups 0 and 1), with a clear difference between them in proper motion space and distance. Groups L1544 and L1517 in Luhman (2023) belong to groups 0 and 1 in this work, respectively. However, the many groups identified by Luhman (2023) in the center of the TAMC belong to one of the two underlying groups identified in this work. Our method does not assume a common LSR velocity for the full TAMC, as in Luhman (2023), and does not require knowing the radial velocity of the stars to evaluate the differences in (U, V, W), because there are very few sources with accurate determinations of the radial velocity (see Galli et al. 2019). We just use high quality data from Gaia DR3 and the analysis of the properties of the TAMC to select optimal dynamical tracers for the classification. This strategy proves to be very efficient in the description of the region.
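The generation of synthetic instances of the sample that respect the uncertainties and their correlations (see Appendix A) can be sketched as follows. The sketch assumes that the covariance matrix is built from the standard errors and correlation coefficients and then factorised with a Cholesky decomposition; for brevity it uses only three parameters (parallax and the two proper motion components) instead of the full five, and the numerical values are hypothetical, not Gaia DR3 entries.

```python
import numpy as np


def covariance_from_errors(sigmas, corr):
    """Build C with elements c_ij = rho_ij * sigma_i * sigma_j."""
    sigmas = np.asarray(sigmas, dtype=float)
    return np.asarray(corr, dtype=float) * np.outer(sigmas, sigmas)


def draw_realisations(mean, cov, n, rng):
    """Return n correlated Gaussian draws x = mean + A r, where C = A A^T
    (A lower triangular) and r is a vector of independent N(0, 1) numbers."""
    A = np.linalg.cholesky(cov)
    r = rng.standard_normal((n, len(mean)))
    return np.asarray(mean, dtype=float) + r @ A.T


# Hypothetical source: (parallax [mas], pm_ra* [mas/yr], pm_dec [mas/yr]).
mean = [7.5, 8.0, -20.0]
sigmas = [0.03, 0.04, 0.03]
corr = [[1.0, 0.2, -0.1],
        [0.2, 1.0, 0.3],
        [-0.1, 0.3, 1.0]]

cov = covariance_from_errors(sigmas, corr)
rng = np.random.default_rng(1)
samples = draw_realisations(mean, cov, 10_000, rng)

print(samples.mean(axis=0).round(3))
print(np.corrcoef(samples, rowvar=False).round(2))
```

Each row of `samples` is one realisation of the source; repeating this for every star in the list yields one synthetic instance of the whole sample per row index, on which the clustering can then be run.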

Kinematics of the TTS candidates and membership probability
So far, we have discussed the kinematics in the TAMC based on the sample of 167 'bona fide' TTSs. In this section, we use the kinematic classification performed in Sec. 3.1 to assign to each of the 13 TTS candidates a membership probability for each group; to that purpose, we build a logistic regression model using the Scikit-learn package (Pedregosa et al. 2011). A detailed explanation of this method with an application to astrophysics is provided in Beitia-Antero et al. (2018); in this work, we only provide a simple explanation of the basics of the methodology as a guide to the reader for our particular case.
We work with a qualification sample of N = 167 stars (the 'bona fide' TTSs) that are classified into two groups based on their values of ($\mu_{l*}$, $\mu_b$, $\pi$) as explained in the previous section, so the classification depends on three features and is binary (groups k = {1, 0}). In this case, the logistic regression model gives the probability that a given star $x_i$ belongs to Group 1, since the probability of belonging to Group 0 is the complementary:
$$P(k = 1 \mid x_i) = \frac{1}{1 + \exp\left[-\left(\beta_0 + \sum_{m=1}^{3} \beta_m x_{i,m}\right)\right]},$$
where $x_{i,m}$ are the three features of star $x_i$ and the $\beta_m$ are the logistic regression parameters fitted using the qualification sample, which have median values of $\beta_0 = 17.052$, $\beta_1 = -0.085$, $\beta_2 = 0.616$ and $\beta_3 = -1.575$ (see Appendix B for more details on how these parameters were computed taking into account the associated errors on proper motions and parallax). We analyzed the confusion matrix and only one star is misclassified, so this choice of parameters (proper motions in galactic coordinates and parallax) gives a good separation of the groups.
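A minimal sketch of how such a model turns the three features of a star into a membership probability is given below. It uses the median coefficients quoted above, but two points are assumptions on our side and should be checked against Appendix B: that the coefficients β_1, β_2, β_3 multiply μ_l*, μ_b and π in that order, and that they apply to the features in their original units (mas/yr and mas) rather than to standardised values. The input star is hypothetical.

```python
import math

# Median coefficients quoted in the text: (beta_0, beta_1, beta_2, beta_3).
# ASSUMPTION: beta_1..beta_3 multiply (mu_l*, mu_b, parallax) in this order,
# with the features expressed in mas/yr and mas (not standardised).
BETA = (17.052, -0.085, 0.616, -1.575)


def prob_group1(mu_l_star, mu_b, parallax, beta=BETA):
    """Logistic probability of belonging to Group 1."""
    z = beta[0] + beta[1] * mu_l_star + beta[2] * mu_b + beta[3] * parallax
    # Numerically stable evaluation of 1 / (1 + exp(-z)).
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def prob_group0(mu_l_star, mu_b, parallax, beta=BETA):
    """The classification is binary, so Group 0 gets the complementary probability."""
    return 1.0 - prob_group1(mu_l_star, mu_b, parallax, beta)


# Hypothetical candidate: mu_l* = 5 mas/yr, mu_b = -10 mas/yr, parallax = 6.25 mas.
p1 = prob_group1(5.0, -10.0, 6.25)
p0 = prob_group0(5.0, -10.0, 6.25)
print(f"P(Group 1) = {p1:.3f}, P(Group 0) = {p0:.3f}")
```

In the actual analysis the coefficients are re-fitted for each Monte Carlo realisation and the median probability is reported, so a single evaluation like this one is only indicative.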
Once the logistic regression model is trained, we apply it to the sample of TTS candidates and, following the methodology explained in Appendix B, we take into account the errors in the measurements and compute a median probability of belonging to each of the two groups; these probabilities are shown in Table 3. The location of the candidates in the proper motions-distance diagram is shown in Fig. 7. Only sources No. 3 (J04423721+3401492) and No. 12 (V600 Aur) display properties very different from those of the qualification sample. We want to highlight that the membership probability has been computed even for those sources with astrometric measurements of moderate quality, J04290082+3152597 (RUWE: 1.413), J05000310+3001074 (RUWE: 1.615) and J05005485+3229168 (RUWE: 1.773), and even of bad quality, J04455129+1555496 (RUWE: 4.6815).

Spectral information on the candidates
In addition to the kinematical information, Gaia DR3 provides data on the effective temperature, surface gravity and metallicity for seven of the thirteen candidate sources (see Table 4). These data are compatible with the candidates being late-type stars, as expected for the TAMC stellar population.
A peculiar source is HO Aur. The metallicity derived by Gaia for this source is very low (−3.9707) and, given the high quality of the measurements (RUWE < 1.4), we feel inclined to rely on the data provided by the mission and tentatively identify HO Aur as a Population II field star.
LAMOST spectra are available for a few sources and are compatible with the effective temperatures derived by Gaia. In particular, the spectra of J04510713+1708468 and J05240794+2542438 are compatible with a late M spectral type and display strong TiO bands. We have downloaded the spectra from the VizieR server and measured the TiO-5 index (Cruz & Reid 2002) for these sources, obtaining values of 1.28 and 0.85, respectively, which confirms their M-type classification. The spectra also display a strong Na I absorption feature at 8183/8199 Å (see Fig. 8), which is gravity sensitive (Martín et al. 2010); unfortunately, the spectral resolution is too low to cross-check the surface gravity determined by Gaia for these sources. The spectra also show Li I absorption at 6708 Å but, again, the equivalent widths are very uncertain and thus it is difficult to provide an age estimate for the sources.
The most noticeable characteristics of the spectra of these stars are the prominent emission lines, in particular the Balmer series and the Ca II doublet (see Fig. 8). This indicates that the two M dwarfs are young and possibly accreting (see Basri 2000 and references therein). This is consistent with the method used by GdC2015 to search for TTS candidates, which was based on the UV excess of the sources; 2MASS J04510713+1708468 has FUV − NUV = 1 and J − K = 0.9, and 2MASS J05240794+2542438 has FUV − NUV = 1.4 and J − K = 0.97.
HD 281691 is a G8 star located in front of the TAMC, at a distance of 108.7 pc according to the Gaia parallax, and its proper motions are significantly different from those of the TTSs in the system. HD 281691 has a nearby companion located at a projected distance of 6.78 arcsec (738 au) (Janson et al. 2013) and it is often included in surveys for debris disks (Meshkat et al. 2017). The infrared excess from the disk and the UV emission from the active star resulted in its detection by the GdC2015 survey. HD 30171 (J04455129+1555496) was detected as a TTS candidate in the GdC2015 survey and it has been known to be a WTTS since the early searches for Li I absorption in the X-ray sources detected by the ROSAT satellite in the TAMC (Wichmann et al. 2000). As such, it is included in the catalogue used for the qualification of the TTSs sample (Joncour et al. 2017) and it is a clear member of Group 2. V600 Aur is an active X-ray source which was included in the sample studied by Xing (2010) for the determination of the Li I equivalent width (see Table 1 in that article, [LH98] 173). Its rapid rotation (period of 2.201 days) and high Li I abundance support its pre-main sequence nature and TAMC membership. Note that, though its kinematics apparently deviates from that of the TAMC members, this may be caused by the low quality of the Gaia measurements (RUWE=3.529).
J04590305+3003004 and J05000310+3001074 are included in the Catalogue of Stars in the Northern Milky Way Having Hα in Emission (Kohoutek & Wehmeyer 1999) and display infrared excess emission (Luhman 2023). Recently, J04590305+3003004 has been reported as an 8.8 Myr old dipper star, a subgroup among the young stellar objects that exhibit dimming variability in their light curves (drops in brightness by 10%−50%), attributed to the occultation of the star by the circumstellar disk material (Capistrant et al. 2022).
Little or no information is available about J04290082+3152597, J04423721+3401492, J04595003+3049082, J05005485+3229168 and J05050770+2923477. Two of them are, according to our work, very good candidates to being members of the sparse young stellar population of the TAMC; however, the kinematics of J04423721+3401492 differs significantly from that of the TAMC members even though the quality of the Gaia measurements is quite good (RUWE=1.482).
In summary, 3 out of the 13 candidates identified in the GdC2015 survey are known pre-main sequence stars, namely HD 30171, V600 Aur and J04590305+3003004. Another two are not TTSs associated with the TAMC: HD 281691 is a field G8-type star located in front of the TAMC with a nearby companion at a projected distance of 738 au, and HO Aur has too low a metallicity to belong to the TAMC. The rest (excluding J04423721+3401492) are very reliable candidates and two of them, in particular, have been confirmed as new late-type TTSs from their LAMOST spectra: J04510713+1708468 and J05240794+2542438.

Discussion: dynamical coupling between the TAMC and the TTSs in the region
The radial velocity of the TTSs also provides fundamental information to determine the coupling between the molecular gas in the TAMC and the young stellar population (see e.g. Gomez de Castro & Pudritz 1992). This determination is relevant for two purposes. Firstly, it indicates the degree of dynamical relaxation of the young stellar association. Also, if the coupling is strong, this information can be used to estimate the motion of the molecular gas in the plane of the sky, which otherwise is unfeasible; if the dynamical coupling between the gas and the stars is strong (if the radial velocities are similar), then the motion of the clouds in the plane of the sky can be inferred from the stellar motion and the 3D velocity vector of the molecular gas can be restored.
There have been several attempts to examine this coupling, in particular that by Galli et al. (2019) based on Gaia DR2 data. In their approach, they selected several TTSs (28) and compared their radial velocity with that of the molecular gas in the nearest area within the complex (see Fig. 15 in Galli et al. 2019). There is a well known velocity gradient in the TAMC which is observed in the ¹²CO and ¹³CO maps (Goldsmith et al. 2008; Narayanan et al. 2008); the radial velocity (V_LSR) decreases from 8.5 to 5.5 km s⁻¹ roughly as Galactic longitude increases (see Figs. 7 to 16 in Narayanan et al. 2008). Galli et al. (2019) attempted to search for traces of this gradient in the stellar radial velocity, with uncertain results. In their work, they selected the TTSs with good radial velocity measurements from Gaia DR2 and extracted from the ¹³CO data cubes the velocity of the nearby (in projection) molecular gas (the beam size of the radio maps is 45″ and the Nyquist-sampled pixel ∼20″). They found that stars near L1495, B213 and B216 (which are at l ∼ 170°) have larger V_LSR than stars at L1536 (l ∼ 175°), but the correlation is not good beyond these two extreme groups. Moreover, the uncertainties of the ¹³CO V_LSR measurements are high given the large broadening of the ¹³CO profiles and the extent of the area within the beam. In summary, though the attempt is to be commended, the results are inconclusive.
A simpler, statistical approach may help to examine the star-gas coupling. There are about 6.2 million stars within the galactic coordinates of the region in the Gaia DR3 survey. This number reduces to 697,990 if only sources with accurate radial velocity (relative error < 10%) are included. Among them, only 20,806 are located at a distance compatible with TAMC membership. We have evaluated the radial velocity distribution (V_LSR) for these field stars and compared it with that obtained for the TTSs with high quality Gaia DR3 radial velocity measurements (66 TTSs, of which 40 are WTTSs and 26 are CTTSs). The velocity distribution of the field stars in the area is very broad, with a slight asymmetry towards blueshifted velocities; however, the velocity distribution of the TTSs (both WTTSs and CTTSs) peaks at redshifted velocities of ∼ 2.5 km s⁻¹ (see Fig. 9). The Kolmogorov-Smirnov test rejects the null hypothesis that both distributions are the same; the p-value is 1.32×10⁻¹². Thus, as otherwise expected, the TTSs differ from field stars also in radial velocity. The relevant issue for the sake of this discussion is that the peak of the TTSs velocity distribution is smaller than the V_LSR of the TAMC; thus, the TTSs as a whole seem to be already dynamically decoupled from the cloud.
Unfortunately, the statistics are too poor to seek differences between CTTSs and WTTSs. In spite of the apparent slight bias of the CTTSs towards a higher V_LSR hinted at in Fig. 10, the Kolmogorov-Smirnov test does not reject the null hypothesis that the CTTS and WTTS distributions are independent samples of the same underlying distribution (p-value 0.998).
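For reference, a two-sample Kolmogorov-Smirnov comparison of the kind used in this section can be sketched with the Python standard library alone. The function below is an illustrative implementation with the usual asymptotic approximation for the p-value, not the routine actually used for the figures quoted above, and the two velocity samples are synthetic Gaussian draws with hypothetical parameters (sample sizes, means and dispersions are ours).

```python
import math
import random
from bisect import bisect_right


def ks_two_sample(a, b):
    """Return (D, p): the maximum distance between the two empirical cumulative
    distributions and the asymptotic p-value of the null hypothesis that both
    samples are drawn from the same distribution."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    d = max(abs(bisect_right(a, x) / n - bisect_right(b, x) / m) for x in a + b)
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        # The Kolmogorov series converges slowly here and its value is ~1.
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


# Synthetic radial velocities in km/s (hypothetical parameters).
gen = random.Random(0)
field = [gen.gauss(0.0, 20.0) for _ in range(500)]   # broad field population
tts = [gen.gauss(2.5, 3.0) for _ in range(66)]       # narrow young population

d_stat, p_value = ks_two_sample(field, tts)
print(f"D = {d_stat:.3f}, p = {p_value:.2e}")
```

A small p-value allows the null hypothesis to be rejected, whereas a large p-value, as in the CTTS versus WTTS comparison, only means that the data do not provide evidence of a difference.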

In summary
In this work, Gaia DR3 astrometric data have been used to review the list of TTS candidates selected by GdC2015 on the basis of their UV and IR colours. From the initial compilation of 63 sources, only 10 (16%) have been found to be reliable members (sharing kinematics and location with well-known members of the association). Most of these sources are located at low galactic latitudes, in the Auriga-Perseus area. Gaia spectroscopic information has shown that one of the candidates pinpointed in the GdC2015 list, HO Aur, is a very low metallicity, population II star which cannot be a member of this young association. All the rest of the sources are very likely TTSs; in particular, two of them have been identified as accreting late-type dwarfs from their LAMOST spectra: 2MASS J04510713+1708468 and 2MASS J05240794+2542438. This work has also produced some additional results. Firstly, a clean, reliable list of TTSs for astrometric studies has been produced. In addition, the analysis of the radial velocity ($V_{\rm LSR}$) of the TTSs in comparison to those of the field stars and the molecular cloud has shown that the association has clearly distinct kinematics and that the TTSs are no longer coupled to the molecular gas in the TAMC. As the sample is dominated by WTTSs, this suggests that the decoupling is already significant 10-100 Myr after star formation began.

Appendix A: Impact of uncertainties on our clustering analysis
In order to account for the data uncertainties, we used a Monte Carlo sampling technique to generate $10^4$ synthetic datasets.
For each one, we applied the k-means++ algorithm, keeping track of the number of groups generated and the positions of the centroids (the nearest neighbour was used as a proxy to obtain physical coordinates). To this end, for each source in our list of bona fide TTSs, we generated $10^4$ realizations of its astrometric parameters. Gaia DR3 provides the correlations between each pair of astrometric parameters. Given two parameters, $x$ and $y$, with standard deviations $\sigma_x$ and $\sigma_y$, their covariance, $\sigma_{xy}$, and correlation coefficient, $\rho_{xy}$, are related by $\sigma_{xy} = \rho_{xy}\,\sigma_x\,\sigma_y$. If $C$ is the covariance matrix at a given epoch associated with the astrometric solution, which is symmetric and positive-semidefinite, then $C = A A^T$, where $A$ is a lower triangular matrix with real and positive diagonal elements and $A^T$ is the transpose of $A$. In the particular case studied here, these matrices are $5 \times 5$. If the elements of $C$ are written as $c_{ij} = \rho_{ij}\,\sigma_i\,\sigma_j$ and those of $A$ as $a_{ij}$ (the entries in the $i$-th row and $j$-th column), and if $r$ is a vector of univariate Gaussian random numbers (components $r_i$ with $i = 1, \dots, 5$), the required multivariate Gaussian random samples are given by the expressions:

$$
\begin{aligned}
\alpha_c &= \alpha + a_{11} r_1 \\
\delta_c &= \delta + a_{22} r_2 + a_{21} r_1 \\
\pi_c &= \pi + a_{33} r_3 + a_{32} r_2 + a_{31} r_1 \\
\mu_{\alpha,c} &= \mu_\alpha + a_{44} r_4 + a_{43} r_3 + a_{42} r_2 + a_{41} r_1 \\
\mu_{\delta,c} &= \mu_\delta + a_{55} r_5 + a_{54} r_4 + a_{53} r_3 + a_{52} r_2 + a_{51} r_1 ,
\end{aligned}
\tag{A.1}
$$

where $\alpha$, $\delta$, $\pi$, $\mu_\alpha$, and $\mu_\delta$ are the values of right ascension, declination, absolute stellar parallax and proper motions in the right ascension and declination directions provided by Gaia DR3. The $a_{ij}$ coefficients, the elements of $A$, are functions of $\sigma_\alpha$, $\sigma_\delta$, $\sigma_\pi$, $\sigma_{\mu_\alpha}$, and $\sigma_{\mu_\delta}$, the standard errors in right ascension, declination, parallax and proper motions from Gaia DR3, and of $\rho_{\alpha\delta}$, $\rho_{\alpha\pi}$, $\rho_{\alpha\mu_\alpha}$, $\rho_{\alpha\mu_\delta}$, $\rho_{\delta\pi}$, $\rho_{\delta\mu_\alpha}$, $\rho_{\delta\mu_\delta}$, $\rho_{\pi\mu_\alpha}$, $\rho_{\pi\mu_\delta}$, and $\rho_{\mu_\alpha\mu_\delta}$, their respective correlation coefficients, also from Gaia DR3. For each $\pi_c$, we computed the value of the distance by applying the usual relationship, $d_c = 1/\pi_c$.
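The sampling scheme of Eq. (A.1) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in this work: the function names, the random seed and the numerical values are hypothetical, and the lower triangular factor $A$ is obtained with `numpy.linalg.cholesky`, which returns the Cholesky factor of a symmetric positive-definite matrix. The product $A\,r$ reproduces the sums in Eq. (A.1) because $a_{ij} = 0$ for $j > i$.

```python
import numpy as np

# Parameter order: (alpha, delta, parallax, pm_alpha, pm_delta).
N_PARAMS = 5


def build_covariance(sigmas, correlations):
    """Covariance matrix with elements c_ij = rho_ij * sigma_i * sigma_j."""
    sigmas = np.asarray(sigmas, dtype=float)
    rho = np.eye(N_PARAMS)
    for (i, j), value in correlations.items():
        rho[i, j] = rho[j, i] = value
    return rho * np.outer(sigmas, sigmas)


def draw_realizations(values, cov, n_draws=10_000, seed=0):
    """Draw correlated Gaussian realizations x_c = x + A r, with C = A A^T."""
    rng = np.random.default_rng(seed)
    A = np.linalg.cholesky(cov)                    # lower triangular factor
    r = rng.standard_normal((n_draws, N_PARAMS))   # univariate Gaussian numbers
    return np.asarray(values, dtype=float) + r @ A.T


# Illustrative numbers only (not a real Gaia DR3 source). Positions are given
# as offsets in mas so that values and standard errors share the same units.
values = [0.0, 0.0, 7.0, 5.0, -20.0]         # mas, mas, mas, mas/yr, mas/yr
sigmas = [0.02, 0.02, 0.03, 0.04, 0.03]
correlations = {(0, 1): 0.10, (0, 2): -0.20, (2, 3): 0.30, (3, 4): 0.25}

C = build_covariance(sigmas, correlations)
A = np.linalg.cholesky(C)
samples = draw_realizations(values, C)

# Distance for each realization; a parallax in mas gives d in pc as 1000 / parallax.
distance_pc = 1000.0 / samples[:, 2]
print(samples.shape, distance_pc.mean())
```

Each row of `samples` is one synthetic realization of the five astrometric parameters of the source; stacking the realizations of all sources would give the synthetic datasets on which the clustering is repeated.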
For each set of synthetic sources statistically compatible with the qualification sample, we computed the proper motions in Galactic coordinates as pointed out above and used their values, together with the parallaxes, to apply the k-means++ algorithm. The resulting centroids are shown in Fig. A.1 in red.

Fig. 1. Map of the dust distribution in the Taurus, Auriga and Perseus star-forming complexes obtained by the Far Infrared Surveyor on board the AKARI satellite. The location of the main young associations is indicated, as well as that of T Tau and AB Aur, two prominent pre-main sequence stars in the region. The yellow frame marks the area surveyed by GALEX and searched for young stars by GdC2015.

Fig. 2. TTS candidates in the TAMC from GdC2015. Top panel: the location of the TTSs and candidates on the sky is overlaid on the density of GALEX NUV sources, in galactic coordinates; stellar densities are color coded. The density of molecular gas is outlined from the 2MASS extinction map (Lombardi et al. 2010). Bottom panel: color-color diagram used for the selection of the TTS candidates in GdC2015. Candidates are marked with black crosses, and known CTTSs and WTTSs from the qualification sample are represented by blue squares and red circles, respectively. The regression line marking the location of the WTTSs in the diagram is plotted (solid green line), as well as the uncertainty band from the fit (dashed green lines); most of the TTS candidates to be analyzed in this work are within this strip.

Fig. 3. Density of Gaia sources in the TAMC. The densities are color coded in stars per 6 arcmin$^2$ (see lateral bar). The shadow produced by the TAMC filaments over the stellar background is readily identified. The two overdense regions, which correspond to NGC 1747 (an open cluster at 550 pc) and NGC 1746, an asterism or apparent concentration of stars (Cantat-Gaudin & Anders 2020), are easily identifiable. The location of the GdC2015 candidates with Gaia counterparts within 1200 pc is marked by circles and asterisks, color coded by distance. Asterisks represent sources with high-quality Gaia measurements (RUWE ≤ 1.4) and circles represent sources with RUWE larger than this threshold.

Fig. 4. Proper motions (velocity projected in the plane of the sky) of the known TTSs in the TAMC with high-quality astrometric measurements (RUWE < 1.4). CTTSs and WTTSs are represented with blue and red arrows, respectively. Note that the motion of the stars is not parallel to the filaments and is approximately perpendicular to the direction of the magnetic field.

Fig. 5. Histogram of the distribution of the ratio $\mu_b/\mu_l \cos(b)$ obtained for the known CTTSs (blue) and WTTSs (red) in the TMC.

Fig. 6. Location of the two groups identified by the decision tree algorithm. Top: location of the sources; middle: proper motions of the sources in Galactic coordinates; bottom: inclination of the proper motion with respect to the Galactic plane versus distance. The two groups are identified by orange squares (group 0) and violet circles (group 1).

Fig. 7. As in the bottom panel of Fig. 6. The location of the 13 TTS candidates examined in this work is indicated and labeled according to their entry in Table 3. The candidates are marked according to their RUWE value: green diamonds (RUWE ≤ 1.4) and black dartboards (RUWE > 1.4).

Fig. 8. LAMOST spectra of the two newly identified accreting brown dwarfs in the TAMC. Top: LAMOST spectrum; the Balmer series and the TiO bands are readily identified. Bottom: zoom around the Li I feature.

Fig. 9. Histogram of the frequencies of the distribution of stars in the TAMC field in terms of $V_{\rm LSR}$. The distribution of field stars is compared with that of the bona fide TTSs.

Fig. 10. Histogram of the frequencies of the distribution of TTSs in terms of $V_{\rm LSR}$. The distribution of the CTTSs is compared with that of the WTTSs.

Fig. B.1. Final distributions for the logistic regression coefficients after $10^4$ Monte Carlo realizations. The solid black line corresponds to the median value adopted as the best fit, while the red dashed line corresponds to the value obtained from the fit of the original variables without taking errors into account.

Table 1. Sources misclassified as TTSs in the TAMC

Table 2. Qualification sample for the kinematics of the TTSs in the TAMC*
*Full table is available online.

Table 3. Classification of the 13 TTS candidates after applying the logistic regression model with variables ($\pi$, $\mu_{l_{\rm gal}} \cos b_{\rm gal}$, $\mu_{l_{\rm gal}}$).