Clustering the properties of near-Earth objects: physico-dynamical links among NEOs

Context. At present, near-Earth objects (NEOs) are being discovered at an ever-increasing rate. However, their physical characterisation is still significantly lagging behind. In particular, the taxonomic classification of newly discovered NEOs is of great importance with regard to improving our understanding of the population of NEOs.
Aims. In this context, our goal is to probe potential links between orbital properties of NEOs and their composition. We investigate whether we can make a reasonable guess about the taxonomic class of an NEO upon its discovery with a decent orbital accuracy.
Methods. We used a G-mode multivariate statistical clustering method to find homogeneous clusters in a dataset composed of orbital elements of NEOs. We adopted two approaches, using two sets of variables as inputs to the G-mode method. In each approach, we analysed the available taxonomic distribution of resulting clusters to find potential correlations with several unique parameters that distinctively characterise NEOs. We then applied a dynamical model on the same clusters to trace their escape regions.
Results. Approach 1 (A1) led us to obtain NEO clusters that can be linked to a primitive composition. This result was further strengthened by the dynamical model, which mapped outer-belt sources as escape regions for these clusters. We remark on the finding of a cluster akin to S-type NEOs in highly eccentric orbits during the same approach (A1). Two clusters, one with small NEOs in terrestrial-like orbits and one with relatively high inclinations, were found to be common to both approaches. Approach 2 (A2) revealed three clusters that are only separable by their arguments of perihelion. Taken altogether, they make up the majority of known Atira asteroids.
Conclusions. For an NEO whose orbit is relatively well determined, we propose a model to determine whether the taxonomy of an NEO is siliceous or primitive if the orbital elements of the NEO fall within the presented combinations of inclination, eccentricity, and semi-major axis ranges.


Introduction
Asteroids are characterised by a dynamic spectrum of varying compositions and spectral taxonomies across heliocentric distances. Due to the Yarkovsky force and orbital resonances with giant planets, some asteroids or collisional fragments of asteroids end up leaving the main asteroid belt and arrive in the vicinity of Earth, in which case they are referred to as near-Earth objects, or NEOs (Bottke et al. 2002; Morbidelli & Vokrouhlický 2003). The study of NEOs is crucial to improving our understanding of the origin, formation, and evolution of the Solar System, and is closely linked to understanding the origins of life on Earth. Most importantly, they are time capsules that contain primitive material that is less altered, thus preserving compositional signatures of the early solar nebula. In particular, the compositional, morphological, and orbital characterisation of NEOs sheds light on the delivery of water and organics (Ehrenfreund & Sephton 2006; Altwegg et al. 2015; Marty et al. 2016) to the prebiotic Earth, while some NEOs could be potential hazards for life on Earth (Perna et al. 2013), as has been witnessed in the past (Brown et al. 2013; Popova et al. 2013). Furthermore, NEOs are of interest for the future of humankind, as they could be useful as vital resources in the course of interplanetary travel.

The NEO dataset is only available at the CDS via anonymous ftp to cdsarc.cds.unistra.fr (130.79.128.5) or via https://cdsarc.cds.unistra.fr/viz-bin/cat/J/A+A/674/A50. Our NEO dataset with available taxonomy can be obtained upon request.

A&A 674, A50 (2023)

[...] be subsequently reduced following additional targeted or fortuitous observations. However, its composition (Perna et al. 2018; Binzel et al. 2019) remains unknown until dedicated spectroscopic or spectrophotometric observations are made, although the measurement of albedo could, in some cases, be used as an input to indirectly and naively infer the taxonomic class of an NEO (Bus & Binzel 2002; DeMeo et al. 2009, 2015). The absence of a taxonomy at the discovery of an NEO is therefore an inherent disadvantage when studying physical properties of NEOs. We have thus undertaken this study to investigate whether (or to what extent) we could assign a taxonomy to an NEO, upon its discovery, just by looking at its orbital elements. In order to probe links between the composition and orbital elements of NEOs, we applied a multivariate statistical clustering method to the currently available population of NEOs to identify homogeneous clusters of NEOs. We then explored the distribution of available taxonomies within the resulting NEO clusters to distinguish any potential correlations. Finally, we synthesised the results by presenting a simple model for making a reasonable guess as to whether the taxonomy of an NEO is primitive (or not) if its orbital elements fall within the restricted range of orbital elements to which the model applies.

NEO dataset
Our dataset of NEOs comes from the IAU Minor Planet Center (MPC). We queried NEOs belonging to the dynamical groups Atiras, Atens, Apollos, and Amors from the MPC and filtered them based on their uncertainty parameter (U), introduced by the MPC. The latter is an integer from 0 (least uncertainty) to 9 (highest uncertainty) and is a logarithmic measure of the longitudinal uncertainty of the mean anomaly of an orbit after a time span of 10 yr. We chose to only include NEOs with U ≤ 5 in order to select NEOs with a relatively satisfactory orbital determination for this work. We stress that this is an arbitrary choice to select objects with fairly accurate orbits and, at the same time, to maximise the number of objects with known taxonomy. This led to a sample of 14 132 NEOs, whose distribution in terms of the aforementioned dynamical groups is shown in Fig. 1.
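The selection described above can be sketched in a few lines. This is only an illustration: the record layout (`NeoRecord`), the group labels, and the placeholder designations are assumptions, and U is assumed to have been parsed into an integer beforehand.

```python
from dataclasses import dataclass

@dataclass
class NeoRecord:
    designation: str   # placeholder names below are hypothetical
    group: str         # "Atira", "Aten", "Apollo" or "Amor"
    u: int             # MPC uncertainty parameter, 0 (best) to 9 (worst)

NEO_GROUPS = {"Atira", "Aten", "Apollo", "Amor"}

def select_well_determined(records, u_max=5):
    """Keep NEOs of the four dynamical groups whose U does not exceed u_max."""
    return [r for r in records if r.group in NEO_GROUPS and 0 <= r.u <= u_max]

sample = [
    NeoRecord("hypothetical-1", "Apollo", 2),
    NeoRecord("hypothetical-2", "Amor", 7),
    NeoRecord("hypothetical-3", "Aten", 5),
]
kept = select_well_determined(sample)
print([r.designation for r in kept])
```

Raising or lowering `u_max` trades orbital accuracy against sample size, which is the compromise the U ≤ 5 threshold reflects.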
We looked for taxonomic data from published literature based on spectroscopic surveys (Perna et al. 2018; Binzel et al. 2019; Devogèle et al. 2019; Ieva et al. 2020; Simion et al. 2021; Hromakina et al. 2021) as well as spectrophotometric surveys (Lin et al. 2018). The rest of the taxonomic data are sourced from various small-body databases, which include the EARN dataset, the Near-Earth Objects Coordination Centre managed by ESA, and the Small-Body Database of NASA. These sources yielded taxonomies for 1566 NEOs, a number that represents just over 11% of our entire sample. In Fig. 2, we show the distribution of NEOs as a function of their taxonomic classification. Because of the similarity of some spectral taxonomies and with the aim of making the statistics more meaningful, we associated the F-type with the B-type taxonomy and the M-, E-, and P-types with the X-type taxonomy.
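The grouping of similar taxonomies described above amounts to a small lookup table. The sketch below is a minimal illustration; the function names are hypothetical and single-letter class labels are assumed as input.

```python
from collections import Counter

# Merging rule stated in the text: F -> B, and M, E, P -> X.
MERGED_CLASS = {"F": "B", "M": "X", "E": "X", "P": "X"}

def merge_taxonomy(tax_type):
    """Return the merged class; types without a rule are returned unchanged."""
    return MERGED_CLASS.get(tax_type, tax_type)

def taxonomy_counts(types):
    """Count objects per merged taxonomic class."""
    return Counter(merge_taxonomy(t) for t in types)

print(taxonomy_counts(["F", "B", "M", "S", "P"]))
```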
Albedo data were retrieved for NEOs whenever they were available. They were sourced mainly from the NEOWISE dataset (Mainzer et al. 2019) and findings from the literature.

Fig. 2. Taxonomic distribution of NEOs included in this work. The x-axis refers to different taxonomic types and N/A corresponds to NEOs for which taxonomic data are not currently available.

G-mode method
We then applied the G-mode multivariate statistical clustering analysis (Barucci et al. 1987, 2005; Gavrishin et al. 1992) to NEOs in our sample to identify unimodal clustering of objects with a given statistical precision. The G-mode classification method seeks to identify homogeneous clusters of objects within a sample of N objects (NEOs in this case), described by M variables (NEOs' orbital elements and other ancillary parameters of choice) without a priori clustering criteria, and accounts for the uncertainties of the variables of each object. The method involves transformation of the input multivariate sample into a univariate sample through iterative steps. The assignment of an object to a given cluster is based on statistical inference rules. The confidence level defines the probability of accepting the hypothesis that an object belongs to a given cluster and this is the only a priori criterion set by the user in the G-mode method. The objects assigned to each cluster follow a Gaussian distribution normalised to mean 0 and standard deviation 1. Depending on the confidence level, which corresponds to a given critical value q1, expressed in terms of σ, objects are either assigned to a cluster or discarded. For example, the critical value of q1 = 3 corresponds to classifying the entire sample with a probability of 99.7% (3σ level of the standard normal distribution) of making the right decision when assigning an object to a cluster. The higher the q1, the more populous (and the fewer) the output clusters will be, and vice versa. Hence, a G-mode run will result in one or more clusters (depending on the choice of q1 and the statistical nature of the dataset) as well as a group of discarded objects, generally referred to as outliers.
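The correspondence between the critical value q1 and the confidence level can be checked numerically. The sketch below assumes, as the 99.7% figure quoted above suggests, that the confidence level is the two-sided probability of a standard normal deviate lying within ±q1; it is not taken from the G-mode code itself.

```python
import math

def confidence_from_q1(q1):
    """Two-sided probability that a standard normal deviate lies within +/- q1."""
    return math.erf(q1 / math.sqrt(2.0))

# Critical values used in the G-mode runs discussed in this work.
for q1 in (3.0, 2.9, 2.5, 2.3, 2.2, 2.0, 1.9, 1.78):
    print(f"q1 = {q1:4.2f}  ->  {100.0 * confidence_from_q1(q1):.2f}%")
```

Under this assumption, q1 = 3 gives about 99.73%, matching the value quoted in the text.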
We refer to the aforementioned literature for a detailed mathematical description of the method and to Tosi et al. (2005), Perna et al. (2017), Barucci et al. (2019), and Bott et al. (2022) for the implementation of the G-mode method on different types of datasets. We have used the Python implementation of the G-mode code by Hasselmann et al. (2013) in this work.
We used two approaches in terms of input variables to the G-mode method: (1) inclination (i), eccentricity (e), and semi-major axis (a) of the orbit of a given NEO and their respective uncertainties; (2) same as (1) with the inclusion of argument of perihelion (ω), Tisserand parameter with respect to Jupiter (T_J), and the absolute magnitude (H) of a given NEO and their respective uncertainties. The variables i, e, and a used in the first approach effectively fix the orbit of an object in space with respect to the Sun and are primarily of dynamical interest. The three ancillary variables ω, T_J, and H are included in addition in the second approach for physical reasons. Here, H can be directly translated to the physical dimension of an NEO for an assumed albedo, while T_J can link NEOs on cometary-like orbits and ω can be used to identify Jupiter-family comets (JFCs) due to clusterings at ω = 0° and ω = 180° (Sosa et al. 2012). In addition, ω can be exploited to find young collisional families, as they tend to conserve ω (Carruba et al. 2018).
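For reference, T_J and the perihelion and aphelion distances used later as ancillary parameters follow directly from a, e, and i. The sketch below uses the standard expression T_J = a_J/a + 2 cos(i) sqrt((a/a_J)(1 − e²)); the value adopted for Jupiter's semi-major axis and the function names are assumptions, not values taken from this work.

```python
import math

A_JUPITER_AU = 5.204  # approximate semi-major axis of Jupiter in au (assumed)

def tisserand_jupiter(a_au, e, i_deg, a_jup_au=A_JUPITER_AU):
    """Tisserand parameter with respect to Jupiter for an orbit (a, e, i)."""
    i_rad = math.radians(i_deg)
    return (a_jup_au / a_au
            + 2.0 * math.cos(i_rad) * math.sqrt((a_au / a_jup_au) * (1.0 - e * e)))

def perihelion_au(a_au, e):
    """Perihelion distance q = a (1 - e)."""
    return a_au * (1.0 - e)

def aphelion_au(a_au, e):
    """Aphelion distance Q = a (1 + e)."""
    return a_au * (1.0 + e)

# Illustrative orbit, not an object from the dataset.
print(tisserand_jupiter(2.5, 0.6, 10.0), perihelion_au(2.5, 0.6), aphelion_au(2.5, 0.6))
```

Because of the cos(i) term, retrograde orbits (i > 90°) lower T_J substantially, which is why retrograde objects tend to fall below T_J = 2.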

Results
First, we present the results of approach 1 (A1), with three input variables, followed by those of approach 2 (A2), with six input variables. For each G-mode run, statistics of the resulting clusters are tabulated in Appendix A. Each table is followed by a tabulation of the taxonomic distribution (subject to availability) corresponding to the given G-mode run. We note that different G-mode runs in each approach are mutually independent and, therefore, the order of the clusters may change from one G-mode run to another. To facilitate the tracking of a given cluster across G-mode runs of varying confidence levels, we refer the reader to the respective Sankey diagrams available for each approach.

Approach 1 (A1)
A1 at q1 = 3.0
We started off by applying the G-mode method to our dataset at a confidence level of 3σ (or q1 = 3). This resulted in only one homogeneous cluster of NEOs, denoted as C1, and the discarded group of outlier NEOs. In Table A.2, in addition to reporting the mean values of i, e, and a, which were the input parameters to the G-mode method, we have also included mean values of several ancillary parameters of interest, in order to better understand the homogeneity of clusters. These are the minimum orbital intersection distance to Earth (MOID_E), perihelion distance (q), aphelion distance (Q), Tisserand parameter with respect to Jupiter (T_J), and the absolute magnitude (H). The median absolute deviations are also reported in Table A.2 with the mean values. The fact that G-mode returned only one cluster at a confidence level of 3σ and that it contains more than 98% of the NEOs in our sample implies that the global dataset is homogeneous in terms of the three orbital elements used as inputs. We compared the mean parameters of this homogeneous cluster with those of the entire NEO population discovered as of 27/10/2021. There are 26985 NEOs in this population and we report their statistics in Table A.1. This comparison led us to establish that this homogeneous cluster is representative of the entire population of NEOs. For brevity, from now on we refer to this cluster as the background population. The group of outliers, on the other hand, clearly differs from the background population in terms of its mean inclination, which is about 55°, and its mean absolute magnitude, which is about three magnitudes brighter. We note that H remains the parameter that varies the most between the entire population of discovered NEOs and the obtained cluster. This is because we restricted our input NEO sample to objects with a sufficiently accurate orbital determination: due to survey limitations, the largest objects (smaller H) have likely been discovered earlier and tend to have more accurate orbits.
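The per-cluster statistics quoted from the tables (a mean together with a median absolute deviation) can be computed as sketched below. This assumes the deviation is taken about the median, which the text does not specify; the function name and the sample values are illustrative only.

```python
import statistics

def mean_and_mad(values):
    """Return (mean, median absolute deviation about the median) of a sample."""
    values = list(values)
    mean = statistics.fmean(values)
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    return mean, mad

# Illustrative inclinations in degrees, not values from the dataset.
print(mean_and_mad([1.0, 2.0, 3.0, 4.0, 100.0]))
```

The example shows why the two numbers are reported together: a single extreme value shifts the mean strongly while leaving the median absolute deviation almost unchanged.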
A1 at q1 = 2.9
At a confidence level of 2.9σ, apart from the background population, it is possible to see the emergence of a second cluster from the sample, as reported in Table A.3. A careful comparison with the results obtained at the confidence level of 3σ reveals that the newly emerged cluster is directly related to the outlier group obtained at the confidence level of 3σ (refer to the Sankey diagram in Fig. 3). In fact, the outlier group in this run at a confidence level of 2.9σ appears interesting due to its mean semi-major axis, eccentricity, and T_J, all of which correspond to cometary orbital elements. In particular, the mean T_J, which is below 2, suggests that this outlier group may contain objects belonging to the family of Damocloids. Damocloids have orbits similar to those of long-period comets but without any observed outgassing and are thought to be dead or dormant cometary nuclei (Jewitt 2005). Indeed, by inspecting the individual members of this outlier group, we found that 1999 XS35, 343158 Marsyas, 2014 PP69, and 2020 BZ12 all have T_J < 2 and that all have retrograde orbits except for 1999 XS35. These four NEOs represent 25% of the outliers in this G-mode run. Furthermore, 3552 Don Quixote, an NEO with D-type taxonomy and identified cometary activity (Mommert et al. 2014), was also found among the members of this outlier group. Recent work (Fatka et al. 2022) has shown, using broadband photometry, that 2019 PR2 and 2019 QR6, two NEOs with similar orbits found in this outlier group, have a taxonomy similar to D-types. It is known that D-types are a rare taxonomy category among NEOs. These authors modelled cometary activity for this pair of NEOs and the results indicate a very recent formation (not earlier than 420 years ago). With these results, this outlier group ends up with 19% of NEOs with D-type taxonomy. Finally, we note three NEOs with retrograde orbits in this outlier group (343158, 2014 PP69, and 2020 BZ12), which corresponds to 19% of the objects. These dynamical and compositional constraints therefore strongly suggest a cometary origin for the objects in this outlier group.

Fig. 3. This diagram corresponds to the first approach, where three input variables were used. The colours correspond to an Nth cluster in a given G-mode run, i.e. the second cluster of any G-mode run will have the same colour.
A1 at q1 = 1.9
As we kept lowering the value of q1, and hence the confidence level (Fig. 3), we noticed the emergence of three clusters at a confidence level of 1.9σ (or q1 = 1.9). They are reported in Table A.4 and, as expected, it is possible to identify the G-mode cluster C1 as the background population. C2 and C3 are unique in this run and are well constrained by their mean inclination, eccentricity, semi-major axis, and T_J. It can be noted that C3 has 2 < T_J < 3, which is a dynamical criterion used to identify JFCs. We also note that 4 of the 51 NEOs in this cluster (only seven of which have taxonomy available) belong to a primitive composition (two B-type and two D-type taxonomies), which is an independent result that implies the presence of cometary nuclei in this cluster. Their relatively small mean perihelion distance (q ∼ 0.34) and relatively high mean eccentricity (e ∼ 0.88) imply that the NEOs in this cluster undergo drastic thermal variations.
A1 at q1 = 1.78
At q1 = 1.78, we found eight clusters. Their distributions in the e, a, and i spaces can be visualised in Fig. 4. As shown in the left panel of this figure, the distributions of i versus a and i versus e clearly demonstrate the separation among all eight clusters. As can be seen from Table A.5, we notice the familiar C1, representing the background population; C3, which contains 47 NEOs, appears to share a similar mean eccentricity and semi-major axis with the Earth and has a very low mean inclination (i ∼ 3°). Furthermore, this cluster is well behaved (with the lowest dispersion of data points) in terms of its mean absolute deviations for almost all the orbital elements and ancillary parameters reported in Table A.5. We also note that this cluster has the lowest mean MOID_E (∼0.01 au) and the highest absolute magnitude (H ∼ 27) among all the clusters. The latter is about five magnitudes fainter than the threshold absolute magnitude that defines a potentially hazardous object (PHO), and if we were to assume a typical albedo (p_V), the objects in this cluster would be relatively small in size (Pravec & Harris 2007). There are two NEOs for which taxonomy data are available among the 47 members of this cluster and they are of B- and S-types. A B-type is a part of the C-complex taxonomy, which has one of the lowest albedos among taxonomy classes, whereas the S-type has relatively high albedos. However, we do not currently know how significant this result is, since the taxonomies of the other 45 NEOs in this cluster currently remain unknown. In order to gain an insight into their physical dimensions, we derived possible ranges of diameters for the members of this cluster and the resulting histogram is shown in Fig. 5. We chose 0.147 as the mean albedo following Morbidelli et al. (2020) when deriving the range of diameters. As made evident in the results shown [...] 2023), which corresponds to 58 NEOs that are co-orbital with Earth. The sample used in our work contains 41 of those 58 and, in this specific G-mode run, C3 captures 11 (19%) of those 58, whereas the remaining 30 are captured in C1. We explored why this might be the case and, looking at their statistics, it appears that the 30 NEOs that ended up in C1 have higher mean eccentricities (∼0.2), whereas the 11 NEOs in C3 have lower eccentricities (∼0.03). Consequently, the mean eMOID values are higher among those 30 NEOs, whereas the 11 NEOs have lower eMOID values. Therefore, C3 captures low-eccentricity NEOs that are co-orbital with Earth. As can be seen in Fig. 4, this cluster is well concentrated and is located at the extremes of the distributions of e versus a, i versus a, and i versus e.
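The size estimate for this cluster rests on the standard relation between diameter, absolute magnitude, and geometric albedo, D = 1329 km × 10^(−H/5) / sqrt(p_V) (Pravec & Harris 2007). The sketch below applies it with the mean albedo of 0.147 quoted above; the albedo bounds used for the range are illustrative assumptions, not the values adopted for Fig. 5.

```python
import math

def diameter_m(h_mag, albedo=0.147):
    """Diameter in metres from absolute magnitude H and geometric albedo p_V."""
    return 1329.0e3 / math.sqrt(albedo) * 10.0 ** (-0.2 * h_mag)

def diameter_range_m(h_mag, albedo_low=0.05, albedo_high=0.30):
    """(smallest, largest) diameter for an assumed albedo interval.

    The default bounds are hypothetical; a darker surface implies a larger body.
    """
    return diameter_m(h_mag, albedo_high), diameter_m(h_mag, albedo_low)

print(f"H = 27, p_V = 0.147: D = {diameter_m(27.0):.1f} m")
print("range for H = 27:", diameter_range_m(27.0))
```

For H ∼ 27 and p_V = 0.147 this gives a diameter of roughly 14 m, whereas the same albedo at H = 22 gives roughly 140 m.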
C4, which resulted from the same G-mode run (Table A.5), appears to share some similarities with C3. Apart from being the least populous (21 NEOs) and a relatively well-behaved cluster in terms of the median absolute deviations of its orbital elements and ancillary parameters, this is a cluster of NEOs whose orbits are very close to Earth's orbit. This cluster has a mean inclination of ∼24° and may represent smaller NEOs given their mean absolute magnitude (H ∼ 22).
We found that C5, C6, and C7 all have a mean 2 < T_J < 3, which suggests a possible JFC nature. We noted above the emergence of a single cluster whose mean T_J was between 2 and 3 in the G-mode run with a higher confidence level of 1.9σ. As we have now lowered the confidence level, this G-mode run is able to find subtle variations with which three clusters can be defined, instead of a single cluster, while still having a mean 2 < T_J < 3. Not only are these clusters decently separated by their mean T_J itself, but also by their mean inclination, eccentricity, perihelion, and aphelion distances, which are also relevant discriminators. C5 resides outside Earth's orbit, whereas C6 has the most eccentric orbits, with mean perihelia located inside Mercury's orbit and mean aphelia located close to Jupiter's orbit. This makes C6 undergo significant thermal variations, similar to the C3 found in the G-mode run with q1 = 1.9 earlier. C7 has its mean perihelia inside Earth's orbit and mean aphelia beyond Jupiter's orbit. In terms of available taxonomy, we found that C5 has three C-type, two D-type, and one X-type objects, which corresponds to a primitive composition. For C7, we found two B-type, three C-type, two D-type, one S-type, and two X-type objects, which also points towards a primitive composition. However, for C6, there are three D-type, one R-type, and four S-type objects, which suggests the presence of NEOs of both carbonaceous and silicate-dominated (olivine- and pyroxene-rich minerals) compositions. Nevertheless, we offer the caution that this result is obtained from a small number of objects with taxonomic data, which is less than 20% of all the objects in any of these three clusters.
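The T_J thresholds used in this section (T_J < 2 for Damocloid-like orbits and 2 < T_J < 3 for JFC-like orbits) can be summarised as a simple labelling rule. This is a sketch only: the label strings are hypothetical, the exact boundary values are assigned arbitrarily, and a dynamical label by itself says nothing definitive about composition.

```python
def tj_regime(t_j):
    """Label an orbit by its Tisserand parameter with respect to Jupiter."""
    if t_j < 2.0:
        return "Damocloid-like"
    if t_j < 3.0:
        # T_J exactly equal to 2 falls here by convention of this sketch.
        return "JFC-like"
    return "asteroidal"

def is_retrograde(i_deg):
    """An orbit is retrograde when its inclination exceeds 90 degrees."""
    return i_deg > 90.0

for t_j in (1.5, 2.6, 6.8):
    print(t_j, tj_regime(t_j))
```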
Further lowering q1 resulted in a rapid growth in the number of output clusters and these were also found to be less significant, thus challenging the stability and the definition of a cluster. Therefore, we concluded our investigation of clusters at q1 = 1.78 for A1.

Approach 2 (A2)
A2 at q1 = 3
This approach (A2) includes six input variables, as mentioned earlier, and at the initial confidence level of 3σ we find one homogeneous cluster of NEOs. The mean values of input variables and several ancillary parameters of interest are reported in Table A.6. The results indicate that more than 99% of the NEOs in our sample belong to one cluster, meaning that the entire sample of NEOs is homogeneous in terms of the six variables used at the confidence level of 3.0σ. The mean values of this cluster are comparable to the background population of NEOs (Table A.1). As was the case with the first approach (A1) at the confidence level of 3σ, the mean inclination and mean absolute magnitude remain the major discriminators between the homogeneous cluster of NEOs and the group of outlier NEOs.
A2 at q1 = 2.5
As we kept lowering q1, we noticed the emergence of a second cluster of NEOs (C2) at the confidence level of 2.5σ (Table A.7). From the Sankey diagram of Fig. 6, it is possible to notice that this second cluster is seeded from the group of outlier NEOs with high inclination. The group of outlier NEOs resulting from this G-mode run have moderate inclinations with a mean T_J ∼ 6.8 and are relatively small (mean H ∼ 22.6).

A2 at q1 = 2.3
We observed the emergence of a third cluster of NEOs (C2) at the confidence level of 2.3σ (Table A.8) and this cluster corresponds to a cluster of 47 smaller NEOs (mean H ∼ 29.4). This cluster is well behaved in terms of the median absolute deviations of its orbital elements and ancillary parameters. We remark that a similar cluster (C3) of NEOs was found with the first approach (A1) at the confidence level of 1.78σ and NEOs in that cluster were slightly brighter or larger than those of this third cluster found using the second approach (A2). Despite the similar distributions in the orbital elements, the mean perihelia of these two clusters slightly differ from each other.

Deshapriya, J. D. P., et al.: A&A proofs

Fig. 6. Sankey diagram showing the transfer of objects among different clusters across G-mode runs of varying confidence levels from 3σ to 2σ. This diagram corresponds to the second approach, where six input variables were used. The colours correspond to an Nth cluster in a given G-mode run, namely, the second cluster of any G-mode run will have the same colour.
A2 at q1 = 2.2
A fourth cluster emerged at q1 = 2.2 and it corresponds to moderately inclined NEOs (mean i ∼ 16.5°) with relatively small semi-major axes (mean a ∼ 0.7 au), as reported in Table A.9. This characteristic distribution of orbital elements gives this cluster relatively high T_J values (∼7.8). It is also possible to notice that C2 (at q1 = 2.3) has now grown, almost doubling in size, thanks to the influx of NEOs from C1 and from the group of outliers at q1 = 2.3, as evident from the Sankey diagram of Fig. 6.

A2 at q1 = 2.0
The clusters we obtained at q1 = 2.0 can be visualised with respect to the e versus a, i versus a, and i versus e spaces in the right panel of Fig. 4. Their separation can best be visualised in the i versus a space, although C3, C4, and C5 are grouped together in this figure. Combining the information from the Sankey diagram of Fig. 6 and from Table A.10, at q1 = 2.0, we notice that the latest cluster (C3 at q1 = 2.2) gets refined (including the removal of several outlier NEOs) as it is sub-divided into three clusters, with the main discriminator being the argument of perihelion ω. As reported in Table A.10, C3, C4, and C5 are only distinguishable by their mean ω values, which are also well behaved given their respective mean absolute deviations. In addition, we remark that previously found clusters persist and are fed by an ongoing influx of NEOs, in spite of a confidence level that is consistently reduced, which makes a case for their stability.
We also highlight that C2 is stable and has grown in size to 194 NEOs at this confidence level of 2.0σ. Its mean absolute magnitude of 28 implies that this cluster continues to contain relatively small NEOs. We made a histogram to visualise the estimated diameters of this cluster (Fig. 7). These estimates are closely comparable to those of the NEOs of cluster C3 resulting from A1 at 1.78σ (Fig. 5). It appears that both clusters sample a continuum of objects that are relatively small, characterised by a unimodal distribution peaking in the diameter range from 5 to 15 m. This can be further seen in Fig. 4, where they can be closely compared in the e versus a, i versus a, and i versus e spaces. We see that 36 out of 47 (>75%) NEOs in the cluster C3 resulting from A1 at 1.78σ find themselves in the cluster C2 resulting from A2 at 2.0σ, which reinforces the presence of a population of small NEOs with orbital elements that are similar to those of Arjuna-type asteroids (Tables A.5, A.10).
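The overlap quoted above (36 of 47 members) is a plain set intersection between the member lists of two clusters. A minimal sketch, with a hypothetical function name and placeholder designations rather than real members:

```python
def overlap_fraction(cluster_a, cluster_b):
    """Fraction of the members of cluster_a that are also in cluster_b."""
    members_a, members_b = set(cluster_a), set(cluster_b)
    if not members_a:
        raise ValueError("cluster_a must not be empty")
    return len(members_a & members_b) / len(members_a)

# Placeholder designations, not actual cluster members.
a1_c3 = ["neo-1", "neo-2", "neo-3", "neo-4"]
a2_c2 = ["neo-2", "neo-3", "neo-9"]
print(overlap_fraction(a1_c3, a2_c2))

# The fraction reported in the text: 36 of 47 members.
print(f"{36 / 47:.1%}")
```

Note that the measure is not symmetric: the same 36 objects make up a much smaller share of the 194 members of the A2 cluster.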
As for C6, which represents NEOs of high inclination (mean i ∼ 52°), it has grown in size compared to the previous confidence level of 2.2σ. Clusters began to lose their significance and became less meaningful when q1 was lowered further below 2.0 for A2. Hence, we concluded the analysis for A2 at q1 = 2.0.

Potential escape regions for obtained G-mode NEO clusters
As reported up to now, our results are based on orbital and several ancillary NEO data fed to the G-mode statistical tool and on available NEO taxonomic data. To expand on the dynamical aspect of this work, we probed potential source regions for the NEO clusters resulting from our G-mode runs. Thus, we carried out an investigation similar to that of Binzel et al. (2019) to estimate a probability distribution for our G-mode clusters, taking into consideration a dynamical model of seven escape regions in the main asteroid belt. It is assumed that an object must escape from an escape region where it has held long-term residence dating back to the early Solar System, prior to entering near-Earth space. This dynamical model (Granvik et al. 2016, 2017, 2018; the Granvik model hereafter) carefully traces the escape paths of objects from the main belt to near-Earth space and accounts for the Yarkovsky effect (Vokrouhlický et al. 2015). Given the orbital elements and absolute magnitude of an NEO, the Granvik model returns a discrete probability distribution function, built up of seven values for the following seven escape regions: (1) Hungaria asteroid family; (2) Phocaea asteroid family; (3) resonance complex around the ν6 secular resonance; (4) resonance complex around the 3:1 mean motion resonance with Jupiter; (5) resonance complex around the 5:2 mean motion resonance with Jupiter; (6) resonance complex around the 2:1 mean motion resonance with Jupiter; (7) JFCs. As such, assuming that the NEOs belonging to a given cluster are represented by the mean values of the variables defining the cluster, we used the mean orbital elements and mean absolute magnitude of a cluster as inputs to the Granvik model (as if each cluster were a single NEO). For each approach, we selected the G-mode run with the largest number of clusters, as they represent more diversity, namely: clusters at 1.78σ for A1 and clusters at 2σ for A2 (Fig. 8).
As for approach 1 at 1.78σ, the results indicate that most of the clusters have a higher probability of emanating via the ν6 secular resonance, which is known to be the most dominant escape path from the main belt to near-Earth space (Bottke et al. 2002). In addition, we observe that C5 and C7 have significant probabilities of emanating from outer-belt sources, such as the 2:1 mean motion resonance with Jupiter and JFCs. These two clusters have been attributed to a primitive composition, as we demonstrated earlier in this paper, based on their taxonomy and T_J, which are two independent pieces of evidence. Thus, our dynamical modelling reinforces and corroborates these existing results, pointing towards a primitive origin for the NEOs in these two clusters. When it comes to approach 2, all the clusters obtained at 2σ have a greater probability of emanating via the ν6 secular resonance, as expected (Bottke et al. 2002).

Fig. 8. Probability distributions for the G-mode clusters obtained at 1.78σ for A1 and at 2σ for A2 to emanate from any of the seven escape regions according to the Granvik model (Granvik et al. 2018). The numbers correspond to each of the clusters, represented by a unique colour. The group of outlier NEOs in each G-mode run is labelled 0 and shown in black. The error bars correspond to the error of the mean for the nominal probability.

Discussion
We implemented a G-mode analysis on NEOs structured on two distinct approaches based on input variables. During A1, we used i, e, and a of NEO orbits as inputs and during A2, we used six input variables, which included i, e, a, T_J, H, and ω of NEOs.
With regard to A1, we started our G-mode analysis at a higher confidence level of 3σ and gradually lowered it down to 1.78σ. The most highly populated cluster, representing the background population of NEOs, was present in all the G-mode runs, as it dominated at all the confidence levels investigated. As far as A2 is concerned, we observed similar behaviour for the most highly populated cluster within the implemented confidence levels from 3.0σ to 2.0σ. In A1, C6 obtained at the confidence level of 1.78σ with a mean 2 < T_J < 3 had a mixed composition (D- and S-types) in terms of available taxonomy. Interestingly, Simion et al. (2021), in their work on spectral properties of NEOs with low T_J, report a particular group of four S-complex NEOs (e ≥ 0.92, q ≤ 0.3 au), which corresponds to some members of this C6. These authors propose that T_J ∼ 3 can act as a compositional border marking the transition from a non-primitive composition to a primitive composition and define the presence of S-type NEOs in highly eccentric orbits with low perihelia as an extreme case. Notably, they report (Table 6 in Simion et al. 2021) 394130 (2006 HY51), 465402 (2008 HW1), 455426 (2003 [...]), and 331471 (1984 QY1) as NEOs of this particular group and they are all members of our C6 in this G-mode run. Hence, we agree that T_J ∼ 3 can indeed be used as a compositional border, as proposed by the aforementioned authors. Our result also corroborates that the presence of S-complex NEOs in eccentric orbits in a regime of 2 < T_J < 3 is an extreme case.
Having been given only three orbital parameters as inputs, G-mode not only initially finds NEOs corresponding to a cometary origin in terms of T J at a higher confidence level, but is also able to independently find extreme, compositionally heterogeneous sub-clusters (the presence of S-type NEOs in a regime dominated by NEOs with primitive taxonomies, which are rich in organics and volatiles and emanate from a cometary origin) inside the main cluster with a mean 2 < T J < 3, at a lower confidence level. This is an important demonstration of the versatility of the G-mode method. As reported below Table A.5, we have 36 NEOs for which there is no available taxonomy in this C6. Based on the selection criterion of e ≥ 0.92, q ≤ 0.3 au, we report in Table A.11 the NEOs from C6 whose taxonomy is predicted to be of S-complex. We stress, however, that the optical colors of 2002 PD43 published by Jewitt (2013) do not seem to match this prediction. Hence, more taxonomic data are needed to further verify the accuracy of this method.
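The selection criterion quoted above (e ≥ 0.92, q ≤ 0.3 au) and the Tisserand parameter T J can be evaluated from a, e, and i alone. The sketch below uses the standard relations q = a(1 − e) and T J = a_J/a + 2 cos i [(a/a_J)(1 − e²)]^(1/2); the value adopted for the semi-major axis of Jupiter and all function names are assumptions made for illustration.

```python
import math

A_JUPITER_AU = 5.2  # approximate semi-major axis of Jupiter (assumed value)


def perihelion_distance(a_au, e):
    """Perihelion distance q = a (1 - e), in au."""
    return a_au * (1.0 - e)


def tisserand_jupiter(a_au, e, i_deg):
    """Tisserand parameter with respect to Jupiter."""
    ratio = a_au / A_JUPITER_AU
    return 1.0 / ratio + 2.0 * math.cos(math.radians(i_deg)) * math.sqrt(
        ratio * (1.0 - e * e))


def meets_s_complex_criterion(a_au, e):
    """True if e >= 0.92 and q <= 0.3 au, the criterion quoted in the text."""
    return e >= 0.92 and perihelion_distance(a_au, e) <= 0.3
```

As a sanity check on the formula, an orbit identical to that of Jupiter (e = 0, i = 0) gives T J = 3.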
We recall that DeMeo & Binzel (2008), based on spectroscopic data from a sample of 55 NEOs, estimated that 54 ± 10% of NEOs with T J < 3 have comet-like spectra or albedos. Our clusters C5, C6 (excluding the extreme NEOs with S-type), and C7 in A1, associated with a primitive composition, capture about 21% of NEOs in our dataset with T J < 3. Although the two percentages are not close, our estimate does not contradict theirs. In addition, Licandro et al. (2008) analyzed the spectra of asteroids with T J < 3, also known as asteroids in cometary orbits (ACOs), and found that 35/41 of them have featureless spectra with colours ranging from bluish to red (B-type to D-type), suggestive of a primitive composition. Then, Licandro et al. (2018) studied the spectra of 17 ACOs and found that 15 of them are of either D or X taxonomy, which also implies a primitive composition for these ACOs and which these authors interpret as a strong signal for the presence of extinct or dormant comets. These findings are indeed comparable to the taxonomic distribution in C5, C6, and C7 in A1 (described earlier).
During A2, at the confidence level of 2.3σ we found a cluster (C2) of small NEOs whose stability and continuous growth were observed as the confidence level was decreased down to 2σ.
Given the mean orbital parameters and other reported ancillary parameters of this cluster, we highlight its close relationship with the C3 of A1 obtained at 1.78σ, thereby alluding to Arjuna-type asteroids. This cluster is more than four times as populous as C3 (obtained during A1 at 1.78σ), which shares more than three quarters of its NEOs with it. The estimated diameters of both clusters show unimodal distributions peaking in the range of 5-15 m (Figs. 5, 7). Although larger objects are more likely to be observed in general, the observed peak at small sizes does not necessarily rule out an observational bias in this result, as NEOs do not all come close to the Earth at the same rate.
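To give a sense of the sizes involved, the sketch below converts an absolute magnitude H into a diameter with the commonly used relation D [km] = 1329 p^(-1/2) 10^(-H/5), where p is the geometric albedo. This section does not restate how the diameters were estimated, so the relation, the default albedo of 0.14, and the function name are assumptions of this illustration rather than a description of the procedure actually followed.

```python
import math


def diameter_m(h_mag, albedo=0.14):
    """Diameter in metres from absolute magnitude H.

    The default geometric albedo (0.14) is an assumed, illustrative value.
    """
    d_km = 1329.0 / math.sqrt(albedo) * 10.0 ** (-0.2 * h_mag)
    return d_km * 1000.0
```

Under this assumed albedo, an object with H = 28 has a diameter of roughly 9 m, which falls within the 5-15 m range quoted above.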
During A2, at the confidence level of 2.2σ, we obtained a fourth cluster of NEOs, characterised by a moderate mean inclination i ∼ 16° and a mean semi-major axis of 0.7 au, which indicates that it is the cluster with the lowest orbital energy. Consequently, this cluster samples 25 of 37 known Atira asteroids. It is noteworthy that at the confidence level of 2.0σ, this cluster is sub-divided into three new clusters, with a well-behaved argument of perihelion being the only discriminator. Although we do not find any significant taxonomical difference among the sub-divided clusters, their well-separated and well-behaved arguments of perihelion might hint at a dynamical difference among them.
We see that C8 and C6, obtained during A1 at q1 = 1.78σ and during A2 at q1 = 2.0σ, respectively, appear to share similar orbital elements and ancillary parameters. We compared the NEOs of each cluster individually and, indeed, we found that all the 88 NEOs of C8 obtained during A1 at q1 = 1.78σ are members of the C6 obtained during A2 at q1 = 2.0σ. Therefore, the latter appears to be an extension of the former. We note that the distributions of these two clusters in the i versus a and i versus e spaces in Fig. 4 also hint at this association.
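The member-by-member comparison described above amounts to a set-inclusion test. A minimal sketch follows, assuming each cluster is available as a list of NEO designations; the function name and the placeholder designations are hypothetical.

```python
def membership_overlap(cluster_a, cluster_b):
    """Compare two clusters given as iterables of NEO designations."""
    a, b = set(cluster_a), set(cluster_b)
    shared = a & b
    return {
        "shared": len(shared),
        "fraction_of_a": len(shared) / len(a) if a else 0.0,
        "a_subset_of_b": a <= b,  # True if every member of a is also in b
    }


# Placeholder designations, for illustration only.
smaller = ["neo-1", "neo-2", "neo-3"]
larger = ["neo-1", "neo-2", "neo-3", "neo-4"]
overlap = membership_overlap(smaller, larger)
```

A result with `a_subset_of_b` equal to `True` corresponds to the situation reported here, in which every member of the smaller cluster also belongs to the larger one.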
As a general remark, we find that two clusters are preserved during both approaches, without considering the background population, which is always represented as C1: (a) smaller NEOs with orbital parameters similar to those of Earth and (b) NEOs with relatively higher inclinations (i ∼ 55°). Nevertheless, the clusters of NEOs associated with primitive composition during A1 are lost during A2. This is probably due to the inclusion of three additional variables during A2. Following the results obtained during A1, we propose that our method could potentially be used to make a reasonable guess as to whether an NEO with an accurate enough orbital determination could be associated with a primitive taxonomy.
As a verification check on the obtained results, we performed several G-mode runs with different samples of NEOs selected by varying U. For example, we restricted the sample by U ≤ 4, U ≤ 3, U ≤ 2, and even U ≤ 6. In all cases, our G-mode runs with varying confidence levels resulted in groupings of clusters similar to those we had already obtained with U ≤ 5, as reported in this work. This indicates that our results do not depend on the U threshold used for the NEO sample selection.

Conclusions
In response to the need to constrain the physical properties of NEOs, which are being discovered at an increasing rate, we attempted to find a way to make a reliable guess of the taxonomy of an NEO on the basis of its orbital elements. To achieve this, we used the G-mode clustering analysis. Our G-mode analysis of NEOs based on two approaches has led to the identification of unique clusters of NEOs. In each approach, we started at a confidence level of 3σ and gradually lowered it to 1.78σ and 2σ for A1 and A2, respectively, allowing more clusters to emerge. A1 allowed us to obtain clusters of NEOs that can be associated with a primitive composition based on their available taxonomy. The dynamical model we implemented to trace the escape regions of the obtained G-mode clusters strengthened this result, as it pointed towards outer-belt sources, which are correlated with a primitive origin. We also found a cluster of NEOs that can be associated with S-type NEOs in highly eccentric orbits with low perihelia, which have been evoked in an earlier study and remarked on as an extreme case by Simion et al. (2021). Accordingly, we predicted (Table A.11) a list of NEOs that would be of an S-type taxonomy, waiting to be validated as soon as their taxonomic data are available.

Table 1.
i (°)           e             a (au)        Taxo       Cluster
40.6 ± 8.64     0.8 ± 0.08    6.4 ± 0.69    P          A1 q1=2.90 C0
24.2 ± 2.20     0.6 ± 0.02    3.2 ± 0.09    P          A1 q1=1.78 C5
19.5 ± 3.86     0.9 ± 0.02    2.5 ± 0.08    P/S (*)    A1 q1=1.78 C6
32.0 ± 16.84    0.8 ± 0.05    3.3 ± 0.24    P          A1 q1=1.78 C7
Notes. The column named 'Taxo' refers to the taxonomy, which is presented twofold: P - primitive (B, C, or D-types) or S - siliceous (S, Q-types). Reported uncertainties are median absolute deviations from original clusters. Nevertheless, looking at the distributions of i, e, and a of G-mode clusters, we deem that the combination regimes could span three times the reported median absolute deviations when considering if an NEO can be used as an input to this model. (*) If q ≤ 0.3 au, the taxonomy is siliceous and otherwise it is primitive.
We found two clusters that are common to both A1 and A2. The first of these is a cluster that samples small NEOs in orbits that are comparable to Earth's orbit and that can be associated with Arjuna-type objects. The second of the shared clusters is the one with relatively high inclinations (i ∼ 55°).
In terms of A2, we find three clusters that are only distinguishable by their arguments of perihelion. All of them have very low mean semi-major axes (0.7 au) and moderate mean inclinations (i ∼ 14°). Together, they comprise the majority of known Atira asteroids.
In synthesising the methods and results of this work, we propose that it is feasible to attempt to predict whether an NEO is of primitive taxonomy, provided that (1) its orbit is determined with a certain level of accuracy and (2) its orbital elements fall into the restricted combination regimes presented in Table 1.
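The proposed check against the combination regimes of Table 1 can be sketched as follows. The sketch assumes that each regime is the interval centred on the tabulated value with a half-width of k times the reported median absolute deviation, with k = 3 following the table notes, and that an NEO must fall inside all three intervals (i, e, a) of a given row. The function name, the data layout, and the choice to return every matching row (since the widened regimes may overlap) are assumptions of this illustration.

```python
# Rows of Table 1: (cluster label, i [deg], MAD_i, e, MAD_e, a [au], MAD_a, taxonomy)
TABLE_1 = [
    ("A1 q1=2.90 C0", 40.6, 8.64, 0.8, 0.08, 6.4, 0.69, "P"),
    ("A1 q1=1.78 C5", 24.2, 2.20, 0.6, 0.02, 3.2, 0.09, "P"),
    ("A1 q1=1.78 C6", 19.5, 3.86, 0.9, 0.02, 2.5, 0.08, "P/S"),
    ("A1 q1=1.78 C7", 32.0, 16.84, 0.8, 0.05, 3.3, 0.24, "P"),
]


def guess_taxonomy(i_deg, e, a_au, k=3.0):
    """Return a list of (cluster label, guessed taxonomy) for matching rows.

    An empty list means the orbit falls outside every regime, in which case
    no guess is made.
    """
    matches = []
    for label, i0, di, e0, de, a0, da, taxo in TABLE_1:
        inside = (abs(i_deg - i0) <= k * di
                  and abs(e - e0) <= k * de
                  and abs(a_au - a0) <= k * da)
        if not inside:
            continue
        if taxo == "P/S":
            # Table note (*): siliceous if q <= 0.3 au, otherwise primitive.
            taxo = "S" if a_au * (1.0 - e) <= 0.3 else "P"
        matches.append((label, taxo))
    return matches
```

For instance, a hypothetical orbit with i = 24°, e = 0.6, and a = 3.2 au falls only in the C5 regime and would be flagged as primitive, whereas a typical low-inclination, low-eccentricity orbit with a = 1 au matches no row and yields no guess.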