A Gap in the Mass Distribution for Warm Neptune and Terrestrial Planets

Structure in the planet distribution provides an insight into the processes that shape the formation and evolution of planets. The Kepler mission has led to an abundance of statistical discoveries in regards to planetary radius, but the number of observed planets with measured masses is much smaller. By incorporating results from recent mass determination programs, we have discovered a new gap emerging in the planet population for sub-Neptune mass planets with orbital periods less than 20 days. The gap follows a slope of decreasing mass with increasing orbital period, has a width of a few $M_\oplus$, and is potentially completely devoid of planets. Fitting gaussian mixture models to the planet population in this region favours a bimodel distribution over a unimodel one with a reduction in Bayesian Information Criterion (BIC) of 19.9, highlighting the gap significance. We discuss several processes which could generate such a feature in the planet distribution, including a pileup of planets above the gap region, tidal interactions with the host star, dynamical interactions with the disk, with other planets, or with accreting material during the formation process.


Introduction
Many processes combine to produce the planets we observe, from core formation and accretion at the beginning of a planet's life, to tidal circularization and orbital decay at the end. With the stream of planet discoveries brought in over the past two decades it has become possible to search for traces of these formation and evolution processes and so place observational limits on their action and effects (Winn 2018). One way forward is to study the distribution of planets as a whole, both in terms of planetary and host star parameters. Signatures of their history may remain in the planet population, providing a pathway to testing population synthesis models (e.g., Mordasini et al. 2015).
Several such signatures can be found in the planet radiusperiod plane, such as the Neptunian desert (Mazeh et al. 2016;Owen & Lai 2018) and "radius gap" (Fulton et al. 2017) likely arising from photoevaporation. On a broader scale, the observed occurrence rate of planets has been studied in detail by the Kepler mission, leading to the discovery that Neptuneand Earth-size planets are much more common than those of Jupiter size, as well as providing increasingly improved estimates of the frequency of Earth-like planets (e.g., Fressin et al. 2013;Mulders et al. 2018;Hsu et al. 2019).
In the planet mass-period plane we can expect to view the history of planets from a different angle. While photoevaporation can significantly change a planet's radius, the effect on planetary mass is predicted to be more modest, at least for orbital periods larger than a few days (Owen & Wu 2017;Jin & Mordasini 2018). Current studies of the planet mass population allow investigations of the effects of tides on giant planets (Bonomo et al. 2017) and the occurrence rate of planets out to 20 au (Bryan et al. 2016). A similar Neptunian desert is seen, thought to arise from a combination of tidal interactions with the host star and high-eccentricity migration (Matsakos & Königl 2016;Owen & Lai 2018).
Here we present a new emerging signature in the planet mass-orbital period plane, a gap for planets with mass less than ∼20 M ⊕ and period less than 20 days. The physical reasons for the gap remain unexplained and are left for future exploration, though we provide some plausible hypotheses. These may provide important constrains on planet formation, migration, and star-planet tidal theory.

Planet Sample
We use the confirmed planet sample from the NASA Exoplanet Archive, 6 as of the 2019 May 24. Our prime sample (hereafter P1) consists of all transiting planets with measured masses M p , radii R p , and hence inclination, within the limits M p <25M ⊕ and orbital period P<20 days. These are all transiting planets with masses determined through radial velocities (RVs) or transit timing variations (TTVs). Our second wider sample (hereafter P2) consists of all planets with M p or M i sin p ( ) within the same limits and additionally contains planets with no measured inclination. Only planets with mass measurements better than 3σ were included in either sample. For plots and calculations involving M i sin p ( ), we assumed p = i sin 4 ( ) , the average for an isotropic distribution. We manually check each archive entry for the P1 sample to ensure accuracy. For Kepler-266d and e the values in the archive did not match the best-fit values from the source Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. 6 https://exoplanetarchive.ipac.caltech.edu/ publication (Rodriguez et al. 2018) and were updated; for Kepler-10b we used values from the more recent Rajpaul et al. (2017) as they were inconsistent with the default catalog values, and KOI-142b was removed from the sample as the given source publication (Nesvorný et al. 2013) did not contain a mass measurement for the inner planet. In several cases a more recent publication was available but with results consistent with the archive default value. In these cases, we used the archive default for consistency. After these steps, our P1 sample contained 72 planets, and our P2 sample contained 143 planets including the P1 sample. The full sample, parameters used, and source references are given in Table 1.

The M-P Gap
We plot our planet sample and histogram in Figure 1. A gap in the distribution is seen following a straight line gradient of ∼−1M ⊕ day −1 , with a width of a few M ⊕ . The gap can be seen in both the P2 and P1 samples, although it is clearer for P1, potentially due to the blurring effect of unknown inclination. Below we discuss the relevant observational biases, and conduct tests to determine the significance of the apparent gap.

Observational Biases
As we are using a planet sample drawn from a variety of discovery and characterization programs the biases underlying the sample are complex. We do not attempt to calculate planetary occurrence rates, although we note the TESS satellite should provide a sample of planets more amenable to such a study by the end of its planned mission (Barclay et al. 2018).
All planets are subject to a bias on detection and on characterization. Our P1 sample (72 total) contains planets discovered by the Kepler mission (41), K2 mission (13), TESS mission (4), various other photometric surveys (10), and RV surveys (4). The remaining planets that complete the P2 sample are all discovered through RVs. Full details are in Table 1. Figure 1 shows the various discovery sources in relation to the gap. We note that points both above and below the gap arise from a mix of different discovery missions and methods, indicating that these factors do not cause the observed gap. The gap remains significant if ground-based planet discoveries are removed (Section 3.3).
For planets with known radii, which in our sample were all measured by the transit method, there is a bias against low R R p 2 * ( ) due to decreasing signal strength, and large   . Planets with measured inclination (P1 sample) are colored according to discovery method: purple-Kepler, red-K2, green-TESS, black-radial velocity surveys, cyan-mostly ground-based photometric surveys. Circular points denote radial velocity derived masses, triangular points masses from transit timing variations. Points without measured inclination (P2 sample) are gray. The gap is shown by a dashed line. Right: histogram for our P2 (green) and P1 (purple) samples taken on the gradient of the dashed line, limited to planets within 20M ⊕ . Bins are 3M ⊕ wide. Data points are shown with a random vertical distribution within the histogram bars.
semimajor axis a as the probability of transit decreases with 1/a and is limited by mission baseline. Most transiting planets in the sample were discovered via the Kepler mission, for which the detection efficiency falls for planets with periods longer than several hundred days, and radii smaller than ∼1-2R ⊕ for FGK dwarf stars (Thompson et al. 2018), well outside the range covered by the gap. For ground-based photometric surveys and to a lesser extent K2 the period and radius sensitivity is more relevant, with ground-based surveys in particular struggling to detect planets of Earth and Neptune size aside from a handful of cases with M dwarf host stars. For planets with RV-based detection or characterization, there is a bias in the strength of the Doppler signal arising from the stellar reflex motion, which goes as . From both transits and RVs there is then a bias against planets with long periods and low masses or radii. The hypothesized gap gradient has a different sign to this trend, and hence is caused by a different mechanism.
Our sample contains planets with masses measured through both RVs and TTVs. A key concern is whether the use of two methods is creating the appearance of two populations in the data. For the TTV characterized planets, mass measurements are only possible in multiple planetary systems. There is a further bias toward systems in or near mean motion resonances, as well as to larger planet masses, and larger, transiting, companion planets that are perturbed by the mass of the planet under question. We note that the majority of our TTV-derived masses are from the Kepler mission, where the data precision and short period of these systems lead to reliable mass determinations. For a full discussion of the nature of the TTV signal and its potential biases we refer the reader to the extensive body of literature (Hadden & Lithwick 2014;Steffen 2016;Mills & Mazeh 2017;Agol & Fabrycky 2018). Figure 1 shows the source of the mass measurement for each point, and Section 3.3 demonstrates that the gap arises in both the RV and TTV populations.
Additionally stellar activity, host star brightness, available telescope time, and its effects on observing cadence, plus the choice by various teams of particular planets to target for mass measurement, are all potential biases. Although these effects might lead to trends in the M p -P plane, we would not expect any of them to preferentially avoid planets in a few M ⊕ wide region near the line in Figure 1. The host star brightness is investigated in Section 4, and is a proxy for the likelihood of a target to be selected for characterization, as brighter host stars require shorter exposure times and are more scientifically interesting for further follow-up. The gap shows no dependence on host star brightness.

Gap Width and Gradient
Given the observational biases involved it is premature to characterize the width or depth of the gap in detail. We do however estimate the width and gradient here to help inform a theoretical understanding of the processes that could lead to such a gap. We use the P1 sample as the two populations are more clearly separated in Figure 1.
To estimate the gap gradient and width we employ a support vector machine (Cortes & Vapnik 1995) with a linear kernel, applied using the scikit-learn support vector classification tool (Pedregosa et al. 2011). A simple support vector machine finds the decision boundary separating two classes such that the classes are separated by as clean and wide a gap as possible. In our case we train the classifier by assigning points above the dashed line in Figure 1 to one class and those below to the other. After fitting, the resulting boundary has a gradient of −0.90M ⊕ day −1 and offset of 18.9M ⊕ . The smallest gap between two points in the P1 sample using this gradient is 4.0M ⊕ , an estimate of the gap width. Note that this analysis makes no conclusion as to the gap significance, which is explored below.

Hartigan's Dip Test
Hartigan's dip test (Hartigan & Hartigan 1985) is a test of multimodality in a sample population, which compares the empirical sample distribution to a generalized unimodal distribution. We apply the dip test to our P1 and P2 samples in Figure 2. We trial a range of gap gradients between 0 and −2M ⊕ day −1 , rotating the data into a frame parallel to the tested gradient each time, as the dip test operates in one dimension. We measure the significance of the results by generating 100,000 trials of uniformly distributed random data covering the same parameter space, rotated through the same angle, for each tested gradient.
To measure significance, we fix the gradient to −0.90M ⊕ day −1 , the value found in Section 3.2. For the P1 sample we find a p-value of 2.4×10 −4 , and for the P2, 3.6×10 −2 . We note that the gradient was extracted from the P1 sample, and so the gradient choice here is not optimal for the P2 sample, as seen in Figure 2.
As a test of the underlying biases, we extract the gap significance at the same gradient for several subsamples of the P1 set. The first consists of only planets detected by spacebased observatories, i.e., Kepler, K2, and TESS (58 planets). The p-value found is 9.2×10 −4 , demonstrating that any bias affecting ground-based surveys, such as window functions or airmass-induced systematics, does not generate the gap. We then consider planets with masses derived from TTVs only (30 planets), giving a p-value of 2.2×10 −2 , and planets with masses derived from RVs (42 planets), giving a p-value of 3.2×10 −4 . In all cases the gap remains significant.

Gaussian Mixture Models
As another test of gap significance we consider Gaussian mixture models (GMMs). We implement GMMs using the scikit-learn package (Pedregosa et al. 2011), with no priors on the component Gaussian locations or covariances. We incorporate measurement errors on the data by using 10,000 bootstrap samples, sampling each time from the normal distribution defined by each data point's value and error. For each trial, we fit independent GMMs with between one and five components. Each GMM is refit 100 times with random initial parameters, and the best-fitting result is taken and stored. For each trial we measure the BIC of the best-fitting model.
We consider the above test performed on the P2 and P1 samples. For both samples the results show strong support for a two-component model, with a change in BIC of --+ 19.9 5.8 5.5 over the 10,000 samples when moving from one to two components in the P2 sample, with none of the draws resulting in increased BIC. For the P1 sample the two-component model is also most favored with a change in BIC of --+ 12.7 5.8 5.2 , with 59 out of 10,000 draws giving increased BIC. Three-, four-, and fivecomponent models are rejected as the BIC increases each time. Figure 3 shows the distribution of changes in BIC (lower panel) and a contour of the weighted log probabilities of an example fitted two-component model (top panel).
Curiously the results are strongest for the P2 sample despite the two populations being more well separated in the P1 sample. This is due to the increased number of data points; this increase in significance with more data, despite the extra blurring factor introduced by unknown inclination, supports the physical reality of the gap. Figure 4 shows the dependency of the gap on various stellar and planetary parameters for ease of reference. No clear trends are seen. In particular, we note that the mass-period gap occurs at higher planetary masses to the photoevaporation valley of Fulton et al. (2017). While the photoevaporation valley is poorly constrained at these small semimajor axes, the valley would have to rise significantly as well as be associated with more mass loss than previously thought to lead to the gap observed here. We discuss a range of potential hypotheses and their likelihood to contribute to the gap formation below.

An Accretion/Ejection Boundary?
The fate of small bodies interacting with a planet depends on the ability of that planet to accrete or eject said bodies. As discussed by Wyatt et al. (2016) this balance is set by the ratio of the Keplerian velocity v Kep from the star at the planet's semimajor axis, and the escape velocity from the planet v esc . Expressed in terms of orbital period, this ratio is where the unit of M å is solar mass, period is in days, and density is in g cm −3 . Thus, the accretion/ejection criterion gives approximately the correct dependence of planet mass with period over the range where data exist, but with a normalization that is approximately 160× too high if the boundary lies in the gap. How might the accretion/ejection boundary apply here? There is no obvious answer, but the idea would need to be based on local growth. Perhaps the boundary does lie in the gap due to inefficient growth, and planets grow locally toward it but no higher. However, some of the most massive planets reach even higher masses by late pairwise collisions (e.g., Izidoro et al. 2017), which approximately double their masses, thus leaving a gap just above the boundary.

Multiplanet Stability
An alternative view of the observations is as a pileup of planets above the gap. If nearly every system contained a planet in the pileup region, dynamical instability could lead to the emergence of a gap in the distribution. For multiplanet systems, Gladman (1993) showed that a planetary system can only be stable if the orbital separation between two neighboring planets is larger than a critical separation, implying empty regions on either side of a hypothetical pileup.
While planet stability could in principle describe the existence of such a gap, another explanation is needed to explain why a pileup would form in the appropriate region of parameter space, and occur so uniformly across planetary systems. We give one potential pileup formation mechanism in the next section. Furthermore, this hypothesis would not explain why the seemingly single-planet systems are also not located in the gap, unless those systems contain as yet unseen planets.

Zero-torque Location
In young protoplanetary disks it is possible that planets become trapped in a "zero-torque location," i.e., a radial location where the sum of the torques acting on the planet cancel out such that the planet does not migrate. Such a scenario could explain the pileup of Section 4.2. These zerotorque locations move radially inward as the disk evolves over time (Lyra et al. 2010). It can be seen from Bitsch et al. (2015) and Morbidelli & Raymond (2016) that at late times in the evolution of a disk the zero-torque boundary follows a curve where essentially the planet mass falls off with radial location, as seen in our observational results, though the slope of this boundary is yet to be analyzed in detail and the simulations are limited to outer regions of the disk. Further study of other planet-disk processes including gap-opening and migration is needed to fully assess the importance of the disk phase.

Tides
The radial extent of the gap coincides entirely with the region where star-planet tides would affect both the spin and orbital evolution of planets. Hence, regardless of where the planet is formed and when and how it migrated, a planet entering into this region would effectively "activate" tides. Consequently, we speculate that the gap may be a natural result of spin-orbit tidal interactions, for which current understanding may be limited.
For example, as shown by , the constant phase-lag model (Goldreich 1966), while convenient, is physically and mathematically inconsistent. Further, the constant time-lag model (Hut 1981;Eggleton et al. 1998) has very limited applicability (Makarov 2015). Capture into spinorbit resonances and pseudosynchronous rotation states cannot be accurately realized with such models Noyelles et al. 2014). Instead, the direction of the net tidally induced migration is chaotic with respect to the multidimensional parameter space (Veras et al. 2019), and hence very well might produce gaps like the one seen here.

Conclusion
We have demonstrated the significance of a gap in the planet distribution for close-in Neptune-and terrestrial-mass planets. In the planet mass-orbital period plane, the gap cuts an approximately straight line from (20M ⊕ , 0 days) to (0M ⊕ , 20 days). Several mechanisms could be responsible for the gap formation, including tidal interactions with the host star, dynamical interactions with the disk, with other planets, or with accreting material. New discoveries from the TESS satellite among others will reveal this feature in more detail.