Measuring Transit Signal Recovery in the Kepler Pipeline. IV. Completeness of the DR25 Planet Candidate Catalog


Published 2020 September 10 © 2020. The American Astronomical Society. All rights reserved.
Citation: Jessie L. Christiansen et al 2020 AJ 160 159. DOI: 10.3847/1538-3881/abab0b


1538-3881/160/4/159

Abstract

In this work we empirically measure the detection efficiency of the Kepler pipeline used to create the final Kepler threshold crossing event and planet candidate catalogs, a necessary ingredient for occurrence-rate calculations using these lists. By injecting simulated signals into the calibrated pixel data and processing those pixels through the pipeline as normal, we quantify the detection probability of signals as a function of their signal strength and orbital period. In addition, we investigate the dependence of the detection efficiency on parameters of the target stars and their location in the Kepler field of view. We find that the end-of-mission version of the Kepler pipeline returns to a high overall detection efficiency, averaging a 90%–95% rate of detection for strong signals across a wide variety of parameter space. We find a weak dependence of the detection efficiency on the number of transits contributing to the signal and the orbital period of the signal, and a stronger dependence on the stellar effective temperature and correlated noise properties. We also find a weak dependence of the detection efficiency on the position within the field of view. By restricting the Kepler stellar sample to stars with well-behaved correlated noise properties, we can define a set of stars with high detection efficiency for future occurrence-rate calculations.


1. Introduction

The Data Release 25 (DR25) planet candidate catalog from the NASA Kepler mission (Thompson et al. 2018) represents the culmination of eight years' worth of analysis. The list of 4034 planet candidates was generated in a fully automated fashion from the full Kepler observation baseline of nearly four years. Using an automatic classification scheme called the Robovetter, the threshold crossing events (TCEs) generated by the Kepler data reduction pipeline were dispositioned as either planet candidates or false positives. This automation allowed, for the first time, an attempt to quantify the completeness (false-negative rate) and reliability (false-positive rate) of the catalog. These comprise two of the necessary ingredients for measuring the underlying planet occurrence rates from an observed list of planet candidates.

We have run a series of experiments characterizing the completeness of almost all of the recent versions of the Kepler pipeline, increasing in scope and complexity with each iteration. The work presented here represents the analysis of the fourth such experiment. Previous results can be found in Christiansen et al. (2016, SOC version 9.2, spanning the full observing baseline), Christiansen et al. (2015, SOC version 9.1, spanning one year of observations), and Christiansen et al. (2013, an early version of SOC version 8.3, spanning one month of observations). Each version of the pipeline has its own strengths and weaknesses, and the measured completeness of the accompanying planet candidate catalogs shows significant variations in each case. Therefore, it is crucial for studies of exoplanet demographics that the completeness model used to analyze a given planet candidate catalog be derived from the corresponding pipeline version. The work presented here and in Christiansen (2017) quantifies the completeness of the pipeline version used to generate the catalog published by Thompson et al. (2018); the corresponding quantification of the Robovetter is presented in Coughlin (2017). Christiansen et al. (2016) accompanies the catalog published by Coughlin et al. (2016), Christiansen et al. (2015) accompanies the catalog published by Mullally et al. (2015), and Christiansen et al. (2013) corresponds to an early version of the code used to produce the catalog published by Rowe et al. (2015). In all previous cases, there are important caveats in the careful usage and application of the measured completeness; readers should refer to the relevant citations for additional details. In addition, we note that this work does not quantify the false-positive rate nor false-alarm rate of the Kepler pipeline; these pieces must be calculated and included separately in occurrence-rate calculations.

In Christiansen (2017) and Thompson et al. (2018), we presented an initial analysis of the completeness of the DR25 planet candidate catalog to support calculations of ${\eta }_{\oplus }$, the frequency of Earth-like planets orbiting stars like the Sun. This early analysis was restricted to producing a single one-dimensional measure of the detection efficiency as a function of the multiple event statistic (MES) of the transit signal, that is, its signal strength, for all FGK dwarf stars in the Kepler target list. In this paper, we extend that analysis to investigate the completeness of the Kepler pipeline along several additional axes, for use in defining and supporting additional science use cases. The paper is organized as follows. In Section 2, we describe the generation of the DR25 planet candidate catalog. In Section 3, we describe the details and execution of this fourth transit injection experiment, and in Section 4 we explore the results. In Section 5 we discuss the implications, and in Section 6 we summarize the final results.

2. DR25 Planet Candidate Catalog Generation

The DR25 planet candidate catalog was produced by uniformly processing the full four-year data set (Quarters 1 through 17) through Science Operations Center (SOC) pipeline version 9.3 to produce a list of TCEs, described by Twicken et al. (2016). The TCEs were then evaluated by the Robovetter (Coughlin et al. 2016; Thompson et al. 2018) and dispositioned into planet candidates or false positives. In order to empirically recover the detection efficiency of the process—the likelihood a given planet signal will be correctly identified and dispositioned as a planet candidate—we can replicate the process with a suite of known "ground-truth" signals and analyze their outcomes. We summarize the pipeline and Robovetter processes in the Appendix, with particular emphasis on updates compared to the previous planet candidate catalogs.

3. Pixel-level Transit Injection

In order to characterize the pipeline detection efficiency, we have performed several distinct transit injection experiments. These largely fall into two categories: pixel-level transit injection (PLTI) and flux-level transit injection (FLTI). For PLTI experiments, simulated transit signals are injected into the calibrated pixels before the aperture photometry time series is constructed and cotrended. This allows the total detection efficiency loss to be determined through the photometric and search portions of the pipeline. However, PLTI is computationally expensive because it runs most of the pipeline modules. As a result, these PLTI experiments are limited to one injected planetary signal per target star, but include all available target stars. Hence, PLTI provides an average detection efficiency over a set of stars. Because the stars are not all "average," we also conducted a series of FLTI experiments. For FLTI, the transit signal is injected into the cotrended flux time series within the Transiting Planet Search (TPS) module of the pipeline, and the signal-detection algorithm is performed over a restricted portion of the period search space focused on the period of the injected signal (Burke & Catanzarite 2017a). For "deep" FLTI experiments, we chose a small subset (∼100) of stars and performed ∼600,000 injection and recovery experiments for each star. For "shallow" FLTI experiments, we chose a larger subset (∼30,000) of stars and performed ∼2000 injection and recovery experiments for each star. These tests determined when and how individual stars can deviate from the average detection efficiency measured by PLTI. This document describes the PLTI experiment only; the FLTI products are documented separately in Burke & Catanzarite (2017a), and examples of using FLTI products to measure detection efficiency are discussed in Burke & Catanzarite (2017b).

For the PLTI experiment described here, we inject simulated transit and eclipse signals into the calibrated pixels of 190,128 targets covering the entire focal plane. These injections fell into three distinct categories, each designed for a specific use case: 146,295 targets were injected with planet-like signals at the target location on the CCD, thereby mimicking a planet orbiting the specified target; 33,978 targets were injected with planet-like signals at a location slightly offset from the nominal target location, thereby mimicking a blended eclipsing binary; and 9856 targets were injected with eclipsing-binary-like signals (having both primary and secondary eclipses) at the target location on the CCD. The latter two groups were generated to test the Robovetter's ability to discriminate between various kinds of false positives (for a detailed analysis, see Coughlin 2017) and are made available along with the first group for the community to test their own algorithms. The analysis described in this work will be restricted to the on-target planet-like signals in the first group and to the completeness of the Kepler pipeline, not the subsequent Robovetter stage. To generate the simulated transit signals, we use the DR25 Q1–Q17 stellar parameters provided by Mathur et al. (2017). An updated set of stellar parameters was released by Berger et al. (2018) during this analysis, but was not found to systematically change the conclusions.

For non-M-dwarf targets, each injected signal was generated as follows. First, the orbital period was drawn from a uniform range in period spanning 0.5–500 days. The planet radius was then chosen such that the resulting MES spanned the range 0–20, bracketing the pipeline transition from fully complete (100% signal recovery) to fully incomplete (0% recovery). To estimate the MES for a given injected signal, we take into account the stellar radius, orbital period, planet radius, an average combined differential photometric precision (rmsCDPP; Jenkins et al. 2010; Christiansen et al. 2012), the dilution of the signal by additional light in the photometric aperture, and the duty cycle of the observations (discarding gapped cadences and deweighted cadences with weights <0.5). We note that this process resulted in some unphysically large radii, but ultimately 50% of the injected planets have radii <2 ${R}_{\oplus }$ and 90% have radii <40 ${R}_{\oplus }$. The orbital eccentricity was fixed at zero, and the impact parameter was drawn from a uniform range spanning 0–1. We also note that, in general, the MES that we estimate for the signal prior to its injection into the light curve will not equal the measured MES. We estimate the MES using a single rmsCDPP value, which is the average noise over the light curve; the actual data points into which the signal is injected may have higher or lower levels of local noise. Figure 1 shows the value of the measured MES compared to the estimated MES; the median value is 0.95, with a standard deviation of 0.15. The overall reduction in the measured MES is at least partly due to the average timing mismatch between the injected period, epoch, and transit duration and the finite grid of periods, epochs, and transit durations over which the pipeline searches (see, e.g., Jenkins et al. 1996, 2010).
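The ingredients listed above can be combined into a rough expected-MES estimate. The following is a minimal sketch, not the procedure used for the injections: it assumes a box-shaped transit with depth equal to the squared radius ratio, ignores limb darkening and the impact parameter, and uses a single rmsCDPP value regardless of transit duration. The function name, the 1460 day baseline, and the 0.9 duty cycle are illustrative placeholders.

```python
import math

R_EARTH_KM = 6378.1    # nominal equatorial Earth radius
R_SUN_KM = 695700.0    # nominal solar radius


def estimate_mes(planet_radius_earth, star_radius_sun, period_days,
                 rms_cdpp_ppm, dilution=0.0, baseline_days=1460.0,
                 duty_cycle=0.9):
    """Rough expected MES for a box-shaped transit (illustrative only).

    dilution is the fraction of the aperture flux contributed by other
    sources; baseline_days and duty_cycle are placeholder values.
    """
    ratio = planet_radius_earth * R_EARTH_KM / (star_radius_sun * R_SUN_KM)
    # Transit depth in ppm, reduced by any extra light in the aperture.
    depth_ppm = 1e6 * ratio ** 2 * (1.0 - dilution)
    # Average number of transits falling on valid cadences.
    n_transits = baseline_days * duty_cycle / period_days
    # Single-transit S/N scaled by the square root of the transit count.
    return depth_ppm / rms_cdpp_ppm * math.sqrt(n_transits)


mes = estimate_mes(2.0, 1.0, 50.0, rms_cdpp_ppm=80.0)
print(f"expected MES ~ {mes:.1f}")
```

In this approximation the expected MES scales inversely with rmsCDPP and with the square root of the orbital period, which is why the injected radii must grow toward longer periods to stay near the detection threshold.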


Figure 1. Histogram of the values of the measured MES of the successfully recovered injected signals compared to the estimated MES of the signal prior to injection.


For the M-dwarf targets, where the habitable zone is much closer to the star, we concentrated the injected signals over a smaller period range. For the 3809 targets with 2400 K ≤ Teff ≤ 3900 K and log g ≥ 4, the orbital periods were selected from a uniform range spanning 0.5–100 days. Given the stellar radii, noise properties, and orbital periods, this resulted in a smaller injected planet radius distribution than the remainder of the targets, with 50% of the injected planets having radii <0.92 ${R}_{\oplus }$ and 90% having radii <1.7 ${R}_{\oplus }$.

4. Results

The full table of parameters for the injected signals, and, if they were recovered, the parameters of the recovered signal, is available at the NASA Exoplanet Archive. All three types of injected signals are included (planet injected on-target, planet injected off-target, and eclipsing binary injected on-target); for the remainder of this section we focus solely on analysis of the on-target planet signal injections. Of the 146,295 planets, 45,281 were successfully recovered by the pipeline, as shown in Figure 2. We note here that many signals were injected below the signal detection threshold for the express purpose of exploring the transition region from detection to nondetection. For these purposes, "success" is defined by the ephemeris-matching algorithm described in Section 6.2 of Thompson et al. (2018). This incorporates both a period tolerance and a check on the number of transit events and allows for detections with periods that differ by half/double or a third/thrice the injected period. In the remainder of this section, we explore the detection efficiency of the pipeline with respect to several variables.
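The period part of such a match can be illustrated with a simplified check. This is only a sketch of the idea of accepting period aliases; it is not the ephemeris-matching algorithm of Thompson et al. (2018), which also checks the transit events themselves. The function name and the 1% fractional tolerance are hypothetical.

```python
def period_matches(injected_period, recovered_period, tolerance=0.01):
    """Return True if the recovered period is close to the injected period
    or to 1/3, 1/2, 2, or 3 times it.

    tolerance is a fractional period tolerance chosen for illustration.
    """
    for factor in (1.0, 0.5, 2.0, 1.0 / 3.0, 3.0):
        expected = injected_period * factor
        if abs(recovered_period - expected) <= tolerance * expected:
            return True
    return False


print(period_matches(10.0, 20.05))  # double the injected period, within tolerance
print(period_matches(10.0, 7.0))    # not a recognized alias
```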


Figure 2. Density distribution of simulated planet signals injected on-target, in planet radius and orbital period space (truncated at 7 ${R}_{\oplus }$). The injections are clustered around the pipeline signal-to-noise threshold, which moves to larger radii for longer periods, in order to examine the transition from detection to nondetection by the pipeline.


4.1. Orbital Period

There is an expected drop in detection efficiency at longer periods that is due simply to the window function of the observations: beyond some period, meeting the minimum number of transits required for detection becomes less and less likely. For SOC version 9.3, there is an additional penalty applied during TPS to signals with only three contributing transits (see Section 9.4.4.1 of Jenkins et al. 2017). The top panel of Figure 3 shows the detection efficiency of the pipeline as a function of the number of transits contributing to the detection. For the 77,860 targets with at least four transits with durations shorter than 15 hr (the longest duration searched by the pipeline), we assess the fraction of injected signals that are recovered as a function of the number of contributing transits. As we have done in previous work (Christiansen et al. 2013, 2015, 2016), we analyze the detection efficiency in each bin as a function of the expected MES, fitting a Γ cumulative distribution function of the form

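$$p=F(x\,| \,\alpha ,\beta )=\frac{c}{{\beta }^{\alpha }{\rm{\Gamma }}(\alpha )}{\int }_{0}^{x}{t}^{\alpha -1}{e}^{-t/\beta }\,{dt},\qquad (1)$$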

normalized to the value (c) at MES = 15. The additional penalty for having only three transits is clear and has subsequently been incorporated into the generation of the window function (Burke & Catanzarite 2017c). For the period analysis performed here, we only consider injections with four or more transits. We find a remaining dependence of the detection efficiency on orbital period, shown in the middle panel of Figure 3. From 0 to 300 days, there is little change in the overall detection efficiency, barring a small drop in sensitivity in the 0–50 day bin that is due to the known behavior of the harmonic fitter removing signals at short periods (Christiansen et al. 2013, 2015). For periods longer than 300 days, the detection efficiency falls off slightly, from 95%–96% at the shorter periods to 87%–91% at the longest periods. This is an important caveat for those calculating occurrence rates in this interesting long-period parameter space. Given the dependence of the detection efficiency on orbital period for periods longer than 300 days, we recommend determining the detection efficiency over the period range of interest for a given occurrence rate calculation, rather than relying on the ensemble average detection efficiency, and paying particular attention to the window function.
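The role of the window function at long periods can be illustrated with a simple calculation of the fraction of transit epochs that yield a minimum number of transits. This is a minimal sketch under stated assumptions, not the window function of Burke & Catanzarite (2017c): it counts mid-transit times only, ignores transit duration and cadence deweighting, and takes the observed intervals as an input. The function name and the 1460 day baseline in the example are placeholders.

```python
def window_function(period, intervals, min_transits=3, n_phases=1000):
    """Fraction of transit epochs giving at least min_transits mid-transit
    times inside the observed intervals, a list of (start, end) in days."""
    t_start = min(start for start, _ in intervals)
    t_end = max(end for _, end in intervals)
    hits = 0
    for i in range(n_phases):
        # Step the first transit time uniformly through one orbital period.
        epoch = t_start + period * (i + 0.5) / n_phases
        count = 0
        t = epoch
        while t <= t_end:
            if any(start <= t <= end for start, end in intervals):
                count += 1
            t += period
        if count >= min_transits:
            hits += 1
    return hits / n_phases


coverage = [(0.0, 1460.0)]  # idealized, gap-free baseline
for n_min in (3, 4):
    print(n_min, window_function(500.0, coverage, min_transits=n_min))
```

With gap-free coverage, requiring four transits rather than three shortens the longest fully searchable period, so the choice of minimum transit count should match the cut applied to the injections.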


Figure 3. Upper: one-dimensional detection efficiency of the pipeline calculated in different numbers of transits (N) contributing to the MES. The curves are Γ cumulative distribution functions fit to the data. The binomial uncertainties on the data in each bin are shown with small horizontal offsets for clarity. The black dashed line shows the 7.1σ detection threshold of the pipeline, and the red dashed line shows the hypothetical perfect performance of the pipeline for pure white noise. Middle: as above, but calculated for different orbital period ranges. Lower: as above, but calculated for different ranges of stellar effective temperature.
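The binned recovery fractions and their binomial uncertainties shown in Figure 3 can be computed along the following lines. This is a minimal sketch: the function name and bin edges are illustrative, and the simple binomial standard error is an assumption, since the exact estimator is not specified here.

```python
import math


def binned_efficiency(expected_mes, recovered, bin_edges):
    """Recovered fraction and binomial standard error per expected-MES bin.

    Returns a list of (low_edge, high_edge, n_injected, fraction, error);
    fraction and error are None for empty bins.
    """
    results = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        flags = [r for m, r in zip(expected_mes, recovered) if lo <= m < hi]
        n = len(flags)
        if n == 0:
            results.append((lo, hi, 0, None, None))
            continue
        p = sum(flags) / n
        err = math.sqrt(p * (1.0 - p) / n)  # simple binomial standard error
        results.append((lo, hi, n, p, err))
    return results


# Toy injections for illustration only, not values from the experiment.
mes = [5.0, 5.5, 6.0, 8.0, 8.5, 9.0, 9.5]
found = [False, False, True, True, True, False, True]
for row in binned_efficiency(mes, found, [4.0, 7.0, 10.0]):
    print(row)
```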


4.2. Stellar Properties

The Kepler pipeline measures the noise in a given time series using the combined differential photometric precision (CDPP; Christiansen et al. 2012). The CDPP is calculated at each data point in the time series for a set of 14 different trial transit durations, and it is equivalent to the effective white noise seen by a transit pulse of that duration. An average root-mean-square CDPP value is calculated for each time series and transit duration. For a time series that is dominated by white noise, we expect the rmsCDPP to decrease with increasing transit duration; that is, for a transit duration that is twice as long and integrating over twice as many data points, we expect the rmsCDPP to be reduced by a factor of $\sqrt{2}$. Therefore, we can use the change in the rmsCDPP as a function of increasing transit duration to track how well the noise for a given time series is approximated by white noise. We refer to this as the CDPP slope (see Section 3.1 of Burke & Catanzarite 2017a for details); it is calculated for short (2–4.5 hr) and long (7.5–15 hr) transit durations and is provided as part of the Kepler stellar parameters at the NASA Exoplanet Archive (Akeson et al. 2013). A simulated transit signal will reach the same signal-to-noise ratio (S/N) in two different light curves if they have the same CDPP. However, if they have different CDPP slopes, the ability of the pipeline to extract the transit signal from within the correlated noise is impacted. Burke & Catanzarite (2017a) found previously that the long CDPP slope (CDPPL) was the most useful discriminator, and the following analysis is based on CDPPL. CDPPL tracks the S/N of the stellar variability amplitude on rotation period and some pulsation period timescales. In order to visualize how CDPPL represents the noise in the data, we show in Figure 4 light curves of targets with negative CDPPL (left) and positive CDPPL (right). 
The stars in the right panel have increasing rmsCDPP values with increasing trial transit durations, due to the longer transit durations encompassing a higher amplitude of intrinsic stellar variability. On the other hand, the stars in the left panel have decreasing rmsCDPP values with increasing trial transit durations, as the noise bins down as expected for predominantly white noise.
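The white-noise expectation described above corresponds to a slope of −0.5 when log rmsCDPP is plotted against log transit duration. A minimal sketch of such a slope measurement follows; it assumes an ordinary least-squares fit in log-log space, which may differ in detail from the definition in Burke & Catanzarite (2017a), and the durations and noise values are synthetic examples.

```python
import math


def loglog_slope(durations_hr, rms_cdpp_ppm):
    """Least-squares slope of log10(rmsCDPP) against log10(duration)."""
    xs = [math.log10(d) for d in durations_hr]
    ys = [math.log10(c) for c in rms_cdpp_ppm]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Example long trial durations in hours (illustrative values in the 7.5-15 hr range).
long_durations = [7.5, 9.0, 10.5, 12.0, 12.5, 15.0]
white = [100.0 / math.sqrt(d) for d in long_durations]  # noise bins down as 1/sqrt(duration)
correlated = [40.0 * d ** 0.1 for d in long_durations]  # noise grows with duration

print(loglog_slope(long_durations, white))
print(loglog_slope(long_durations, correlated))
```

A negative slope indicates noise that averages down with duration, while a positive slope indicates that longer durations sample more correlated variability.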


Figure 4. Left: Quarter 5 light curves of a selection of targets with CDPPL < −0.3. Right: the same for targets with CDPPL > 0.1. Note the difference in the y axes.


Figure 5 shows the detection efficiency of the pipeline broken out by stellar effective temperature for stars with low (CDPPL < −0.2, left panel) and high (CDPPL > 0.0, right panel) levels of correlated noise. For this analysis, we have limited the stellar sample to stars with log g > 4.0, to injections with four or more transits (in order to remove the confounding factor of the detection efficiency on the number of transits discussed in Section 4.1), and with injected transit durations shorter than 15 hr. This leaves 91,672 targets with on-target simulated planet signals. Figure 5 shows that for stars with low levels of correlated noise, those with stellar effective temperatures between 4000 and 7000 K (roughly FGK stars) have a well-behaved detection efficiency. For cooler dwarfs, the detection efficiency drops off slightly, plateauing at 92% compared to ∼96% for the FGK stars. For hotter stars, the detection efficiency drops off somewhat more, although the number of recovered injections (166) is too low for a robust analysis. This decline in detection efficiency for non-FGK stars was also noted in earlier versions of the pipeline (Christiansen et al. 2015). We note that the decrease in detection efficiency for the larger stars is not related to their larger stellar radii relative to the injected planet radii, as this has been accounted for when injecting the planets and calculating the expected MES. We examined the effect of using the updated Gaia DR2 stellar parameters of Berger et al. (2018) here and found that the stellar temperature agreement between the two sets of parameters was very high: for the range of stellar properties examined here, Teff,Kep = (0.9998 ± 0.0027) × Teff,Gaia, where Teff,Kep and Teff,Gaia are the stellar effective temperature from the Kepler DR25 stellar catalog and Gaia DR2, respectively.


Figure 5. As for Figure 3, but calculated for different ranges in stellar effective temperature. Left: stars with low CDPPL < −0.2. Right: stars with high CDPPL > 0.0.


The picture is qualitatively similar but quantitatively lower for stars with high levels of correlated noise. The right panel of Figure 5 shows how the pipeline detection efficiency depends on stellar effective temperature for a sample of stars with positive CDPPL values. The 4000–7000 K targets still have the highest detection efficiency compared to the very cool and very hot stars, but the overall detection efficiency is reduced by the presence of the correlated noise. This is demonstrated further in Figure 6, which shows how the detection efficiency changes for a range of CDPPL values over the full sample. For the stars with lower correlated noise (negative CDPPL values), the behavior is as expected, plateauing at 97%. For the stars with higher levels of correlated noise (e.g., intrinsic stellar variability), the detection efficiency falls, plateauing at 90%.


Figure 6. As for Figure 3, but calculated for different ranges in the long CDPP slope (CDPPL).


As noted earlier, there is a correlation between the stellar temperature and the stellar noise properties. Cooler stars are more active, with more starspot and flaring activity. Figure 7 shows the distribution of CDPPL values as a function of stellar effective temperature. The two peaks in the lower panel around 3200 and 4100 K are artifacts of the available targets in those temperature regions, but it is clear that the bulk of the well-behaved (low correlated noise, negative CDPPL) values are found from 5000 to 6500 K. The light curves in the left panel of Figure 4 are relatively well-behaved targets across the temperature range, with CDPPL < −0.3, and the light curves in the right panel are targets with much higher levels of correlated noise, with CDPPL > 0.1, demonstrating the light curve morphology differences in these two populations. One way to select a stellar sample with a uniformly high detection efficiency is to restrict the selection to those targets with negative CDPPL values; this eliminates 27,450 of the original 146,295 targets.


Figure 7. Upper: distribution of CDPPL values as a function of stellar effective temperature. The bulk of the Kepler targets are well-behaved solar-like stars. Lower: fraction of stars with positive CDPPL values as a function of stellar effective temperature. Stars cooler than ∼5000 K and hotter than ∼6500 K are more likely to have higher levels of correlated noise in their light curves.


In addition, we examine the detection efficiency as a function of stellar magnitude, shown in Figure 8. It is somewhat lower for the saturated targets (Kp < 12), which plateau around 91%–93%, rises for the moderately bright targets (13 < Kp < 16), which plateau closer to 95%–97%, and drops significantly for the small number of fainter targets (Kp > 16), reaching only 81%. The behavior at the bright end can be understood by using CDPPL as a measure of the correlated noise in the light curves. Saturated targets typically have larger apertures to capture the bleed from the electrons that overfill the well depth. These stars have the potential for both pointing-correlated changes in flux if the apertures do not adequately capture the flux, and a greater probability of capturing flux from nearby or background targets as the area of the aperture grows. Both of these effects increase the likelihood of correlated noise; the median CDPPL value for targets in the sample with Kp < 12 is 0.111, whereas the same for targets with Kp > 12 is −0.307. It is less clear why the faintest targets (Kp > 16) have significantly reduced detection efficiency, as they have a median CDPPL of −0.394, but there are only a small number of them (4579) in the sample, so they can be removed to select a sample with uniformly high detection efficiency.


Figure 8. As for Figure 3, but calculated for different ranges in stellar magnitude.


4.3. CCD Channel

Another variable to consider is the location of the target in the Kepler field of view. There are a number of CCD channels that have been identified as producing a higher number of spurious long-period TCEs that are due to image artifacts (Thompson et al. 2018). Section 6.7 of Van Cleve & Caldwell (2009) has additional information on the source of the artifacts. Here we examine whether target location influences the detection efficiency, in particular whether the overabundance of long-period false positives on certain CCD channels reduces the detection efficiency of additional signals injected in the same light curve.

Since the field of view rotates every ∼90 days, a given group of targets (called a "sky group") will fall on four distinct CCD channels over the course of a year, symmetrically positioned around the center of the field of view. The number of the sky group and the number of the CCD channel upon which it falls are the same in Season 2, one of the four orientations of the spacecraft around the center of the field of view.

First we examine how the CDPPL value varies across the field, since Section 4.2 establishes the dependence of the detection efficiency on this value. The upper panel of Figure 9 shows the median CDPPL value across the field of view. The large black squares in the corners are the Fine Guidance Sensor CCDs, which were not used in the planet search. There are several notable features in the distribution of the median CDPPL value. First, there is an underlying trend for increasing (worsening) CDPPL values from the lower left to the upper right, which is correlated with decreasing Galactic latitude. With the relatively large Kepler pixels (4''/pixel), crowding becomes an increasing problem closer to the Galactic plane. This can increase the correlated noise measured across the CCD channel, due to the increased likelihood of additional light from, for example, background variable stars and eclipsing binaries. Four of the five channels with the highest (worst) CDPPL values are visited by the sky groups that fall on CCD channel 58 during one of the four observing orientations. Channel 58 is one of the channels most strongly affected by the image artifacts described earlier, which is manifested here in the way the median CDPPL value captures the correlated noise across the channel.


Figure 9. Upper: the median CDPPL value for each sky group. A clear trend across the field of view is evident, tracing the Galactic latitude and worsening with higher crowding at lower Galactic latitudes. One particularly poorly performing set of sky groups that fall on CCD channel 58 during the year are labeled. Lower: the detection efficiency plateau c as a function of sky group for typically well-behaved targets. Nine out of 10 of the worst-performing channels are in the outer ring of modules; see text for additional detail.


In the lower panel of Figure 9 we show the distribution of the plateau values (c from Equation (1)) across the field of view, for transit durations shorter than 15 hr, CDPPL < 0, and at least four transits. By restricting the injections to those shown previously to have the highest detection efficiency, we can examine any remaining influence of the CCD channels on the detection efficiency. There is a weak correlation in Figure 9 between the plateau value and the ring of best focus around the center of the focal plane (see Figure 17 of Van Cleve & Caldwell 2009): nine out of 10 channels with the lowest detection efficiencies (<95%) are in the outer ring of modules (a module being a set of four CCD channels under the same field flattening lens), including the worst-performing channel 11. The CCDs with the best focus fall in a symmetric ring with a radius of ∼2°–3° around the center of the focal plane (the diameter of the full focal plane is 10°). Therefore, any targets that fall on these CCDs will experience less contamination from nearby targets—contamination that could introduce correlated noise and reduce the detection efficiency. We also examined the intra-CCD variation in the plateau values; the standard deviation varies from 0.11 to 0.23, with the same dependence across the field of view as in the upper panel of Figure 9.

We have labeled some of the sky groups that fall on known problematic channels for at least one orientation of the spacecraft during the year. Channels 44, 58, and 62 show the largest numbers of spurious TCEs caused by image artifacts, but the sky groups that fall on these channels do not show significantly decreased detection efficiency compared to the remainder of the channels. We conclude that the detection efficiency is not degraded in these channels with many spurious long-period TCEs, which is discussed further below.

5. Discussion

The results presented here examine, extend, and quantify the previous indications that there are various parameters that influence the detection efficiency of the Kepler pipeline. For occurrence-rate calculations, this implies that with careful target selection, one could increase the completeness (lower the false-negative rate) of a planet candidate catalog. The reliability of such a catalog would need to be recalculated, which is beyond the scope of this paper. For the parameters examined here, we find that most of the detection efficiency differences (stellar temperature, stellar magnitude, position on the field of view) can be captured by the dependence on the correlated noise in the light curve as summarized by CDPPL. Therefore, one can construct a target sample with a high and well-characterized completeness by removing 37,804 targets with positive CDPPL values from the 198,709 targets searched to produce the final DR25 catalog.
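Such a target selection amounts to a simple cut on the long CDPP slope, optionally combined with the faint-magnitude cut discussed in Section 4.2. The following is a minimal sketch; the record layout, the key names (`kepid`, `cdpp_slope_long`, `kepmag`), and the example values are hypothetical and do not correspond to a specific archive table schema.

```python
def select_targets(targets, max_cdpp_slope_long=0.0, max_kp=None):
    """Keep targets whose long CDPP slope is not positive and, optionally,
    whose Kepler magnitude is no fainter than max_kp."""
    selected = []
    for target in targets:
        if target["cdpp_slope_long"] > max_cdpp_slope_long:
            continue  # correlated noise too high
        if max_kp is not None and target["kepmag"] > max_kp:
            continue  # fainter than the chosen limit
        selected.append(target)
    return selected


# Made-up records for illustration only.
catalog = [
    {"kepid": 1, "cdpp_slope_long": -0.3, "kepmag": 14.0},
    {"kepid": 2, "cdpp_slope_long": 0.1, "kepmag": 11.5},
    {"kepid": 3, "cdpp_slope_long": -0.4, "kepmag": 16.5},
]
print([t["kepid"] for t in select_targets(catalog)])
print([t["kepid"] for t in select_targets(catalog, max_kp=16.0)])
```

Any completeness model applied afterward should be derived from injections that pass the same cuts as the selected targets.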

One surprise was the similarity of the detection efficiency for CCDs with known correlated noise issues caused by image artifacts and of those without. As noted in Thompson et al. (2018), CCDs strongly affected by image artifacts produce a large overabundance of weakly detected TCEs with periods 300–500 days. A previous finding by Zink & Hansen (2019) had noted that the presence of multiple signals in the same light curve could degrade the detection efficiency. Therefore, it seemed likely that transit signals injected into targets on these CCDs may suffer reduced detection efficiency. However, on closer investigation, we find that this is not the case. The TPS module of the Kepler pipeline searches over a grid of period, epoch, and transit duration and finds the combination that produces the strongest detection as measured by the MES. For a light curve with multiple transiting signals, this will typically be the signal with the shortest orbital period. If this signal passes a series of vetoes, it is then removed from the light curve, which is then iteratively re-searched for additional signals that pass the MES detection threshold. This excision of data reduces the effective window function of the remaining data, and it also affects the behavior of the harmonic fitter that removes sinusoidal trends in the light curve. Both of these have the effect of decreasing the detection efficiency for additional signals in the light curve.

For this experiment, injected signals were generated uniformly in period between 0.5 and 500 days. The majority of these signals have shorter periods than the spurious signals generated by the image artifacts, so their detection efficiency seems to be largely unaffected by those longer-period image artifact signals, as the injected signals are detected first and removed. Analyzing the longer-period (>400 days) injections separately, we still see a similar detection efficiency between the channels strongly affected by image artifacts and those that are not. The impact of removing long-period image artifact signals on the detection efficiency of additional long-period signals in the light curve therefore seems minimal. This is because removing a long-period image artifact signal from the light curve removes many fewer observations than removing a shorter-period signal, creating a much smaller reduction in the subsequent window function.
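
The window-function argument can be made concrete with a rough estimate: if only the in-transit observations of a removed signal are excised, the fraction of the light curve lost is approximately the transit duration divided by the orbital period. The sketch below uses that approximation. The function name and the example durations and periods are illustrative assumptions, not values taken from the injection experiment.

```python
def excised_fraction(duration_hr: float, period_days: float) -> float:
    """Approximate fraction of observations removed with a periodic signal.

    Assumes only the in-transit points are excised, so the lost fraction is
    roughly (transit duration) / (orbital period). Any padding around each
    transit is ignored.
    """
    if duration_hr <= 0 or period_days <= 0:
        raise ValueError("duration and period must be positive")
    return (duration_hr / 24.0) / period_days


# Illustrative values only: a short-period signal and a long-period one.
short_period = excised_fraction(duration_hr=3.0, period_days=3.0)
long_period = excised_fraction(duration_hr=12.0, period_days=400.0)

print(f"short-period signal removes {short_period:.2%} of the data")
print(f"long-period signal removes {long_period:.3%} of the data")
```

Under these assumed values the long-period signal removes more than an order of magnitude less data, even though its individual transits are longer.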

However, independent of the location on the field of view, we do show that the most complete planet candidate catalog is one confined to signals with periods <350 days. This is not to say that one should not or cannot perform occurrence-rate calculations for targets or signals outside of the bounds enumerated in this work, only that one would have to calculate and apply the appropriate completeness correction for the desired sample. For the target sample defined above, with CDPPL ≤ 0, the detection efficiency for injections with four or more transits, with injected transit durations shorter than 15 hr, and with orbital periods shorter than 350 days, is shown in Figure 10. The final best fit using Equation (1) is α = 33.54, β = 0.2478, and a plateau of c = 97.31%.
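
To illustrate what these best-fit values imply, the sketch below evaluates the detection efficiency as a function of MES. It assumes that Equation (1) is a gamma cumulative distribution function in MES with shape α and scale β, multiplied by the plateau c; that functional form is not restated in this section, so it should be checked against Equation (1) before use. The function names are hypothetical, and the integral is evaluated numerically so that only the Python standard library is needed.

```python
import math

# Best-fit values quoted in the text.
ALPHA = 33.54     # shape parameter
BETA = 0.2478     # assumed to be a scale parameter (not a rate)
PLATEAU = 0.9731  # plateau c, expressed as a fraction


def gamma_pdf(t: float, alpha: float, beta: float) -> float:
    """Gamma probability density with shape alpha and scale beta."""
    if t <= 0.0:
        return 0.0
    log_pdf = ((alpha - 1.0) * math.log(t) - t / beta
               - math.lgamma(alpha) - alpha * math.log(beta))
    return math.exp(log_pdf)


def detection_efficiency(mes, alpha=ALPHA, beta=BETA, c=PLATEAU, steps=2000):
    """Assumed form of Equation (1): c times the gamma CDF evaluated at mes.

    The CDF is integrated from 0 to mes with Simpson's rule.
    """
    if mes <= 0.0:
        return 0.0
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = mes / steps
    total = gamma_pdf(0.0, alpha, beta) + gamma_pdf(mes, alpha, beta)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * gamma_pdf(i * h, alpha, beta)
    return c * min(1.0, total * h / 3.0)


for mes in (6.0, 7.1, 8.0, 10.0, 15.0):
    print(f"MES = {mes:5.1f}: efficiency = {detection_efficiency(mes):.3f}")
```

Under this assumed form, the efficiency reaches roughly half of the plateau near MES = αβ ≈ 8.3 and approaches c for strong signals.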

Figure 10. As for Figure 3, but calculated for the set of stars with CDPPL ≤ 0, for injected signals with four or more transits, orbital periods <350 days, and transit durations <15 hr.

As discussed in Section 3, because of limitations on available resources, the transit injection experiments were performed orthogonally: the wide-and-shallow PLTIs, with one injected signal per target, and the narrow-and-deep FLTIs, with many thousands of signals injected into a smaller number of targets. A limitation of the PLTIs described in this work is that they average over any fine structure in the response of the detection efficiency to features of the targets or the instrument, and they provide only a description of the ensemble behavior. The FLTIs were crucial, for instance, in identifying the role of the CDPP slope in discriminating between well-behaved and poorly behaved targets. However, due to the relatively small number of targets probed by the FLTIs, there may remain parameters to which the detection efficiency of the Kepler pipeline is sensitive that we have not yet identified. All of the light curves with simulated signals from the PLTI and FLTI experiments are available at the NASA Exoplanet Archive for further scrutiny by the community.10

6. Conclusions

This concludes the final Kepler project analysis of the detection efficiency of the Kepler pipeline. The performance of the SOC version 9.3 pipeline (Twicken et al. 2016) in producing the DR25 planet candidate catalog (Thompson et al. 2018) was found to be a return to high detection efficiency after a moderate decrease in SOC version 9.2 (Christiansen et al. 2016). The dependence of the detection efficiency on the properties of both the targets and instrument was explored in some detail, and the pipeline was found to have the highest detection efficiency for FGK stars (4000 ≤ Teff ≤ 7000 K) with well-behaved noise properties. CCD channels with poor focus were found to have decreased detection efficiency, as were signals with periods longer than 300 days.

The fact that the response of the Kepler pipeline varied so strongly in different target and instrument parameter spaces speaks to the importance of transit injection experiments like the work described here. Analogous studies using data from missions like K2 and TESS, which have the opportunity to extend the occurrence-rate results of Kepler beyond main-sequence FGK stars, will similarly need to quantify the completeness and reliability of their resulting planet candidate catalogs to facilitate the generation of robust occurrence rates.

We thank the anonymous referee for thoughtful comments and questions that improved the manuscript. Funding for the Kepler Discovery Mission is provided by NASA's Science Mission Directorate. These data products were generated by the Kepler Mission science pipeline through the efforts of the Kepler Science Operations Center and Science Office. This research has made use of the NASA Exoplanet Archive, which is operated by the California Institute of Technology, under contract with the National Aeronautics and Space Administration under the Exoplanet Exploration Program.

Facility: Kepler.

Appendix

The end-of-mission version of the SOC pipeline has been described in considerable detail in Jenkins (2017) and chapters therein; see Figure 1 of that document for an overview. The code itself is also available online.11 Initially, the raw pixels are calibrated by the CAL module (Clarke et al. 2017), including corrections for bias, gain, nonlinearity, flat-field, and local detector electronics effects (overshoot and undershoot). There is also a correction for the smearing of the image that results from the fact that Kepler operates without a shutter. In version 9.3, the bias correction was updated from a static two-dimensional correction to a fully dynamic two-dimensional correction. This allowed the calibration to capture changes that are due to drifts in the bias values, such as those caused by temperature changes or crosstalk in the CCD electronics.

The calibrated pixels are then used to generate a simple aperture photometry time series in the PA module (Morris et al. 2017). Because Kepler's pointing is extremely stable and precise, the aperture photometry is generated by summing over whole, discrete pixels (as compared to fractions thereof). Prior to this version of the pipeline, the pixels chosen for inclusion in the photometric aperture were calculated by predicting the S/N of the flux contribution of the target star to each pixel by using a model of the CCD, the pixel response function, and the Kepler Input Catalog (Brown et al. 2011). In version 9.3, the procedure was updated to use the calibrated pixels themselves to calculate the S/N (Smith et al. 2017a).
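
As a minimal sketch of simple aperture photometry over whole pixels, the flux at one cadence is the sum of the calibrated pixel values inside a boolean aperture mask. The data layout, the function name, and the toy values below are assumptions for illustration and do not reproduce the PA module.

```python
def simple_aperture_flux(pixels, aperture):
    """Sum calibrated pixel values over the whole pixels flagged in the aperture.

    pixels   : 2D list of calibrated pixel values for one cadence
    aperture : 2D list of booleans with the same shape; True means the
               whole pixel is included (no fractional pixels)
    """
    if len(pixels) != len(aperture):
        raise ValueError("pixels and aperture must have the same shape")
    total = 0.0
    for pixel_row, mask_row in zip(pixels, aperture):
        if len(pixel_row) != len(mask_row):
            raise ValueError("pixels and aperture must have the same shape")
        total += sum(v for v, keep in zip(pixel_row, mask_row) if keep)
    return total


# Toy 3x3 postage stamp (hypothetical values) with a plus-shaped aperture.
stamp = [
    [10.0, 50.0, 12.0],
    [55.0, 400.0, 60.0],
    [11.0, 48.0, 9.0],
]
mask = [
    [False, True, False],
    [True, True, True],
    [False, True, False],
]
flux = simple_aperture_flux(stamp, mask)
print(f"aperture flux: {flux}")
```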

The time series are then corrected for systematic noise components in the Presearch Data Conditioning (PDC) module (Smith et al. 2017b). In version 9.3, PDC was updated to include "spike" basis vectors that corrected for individual observations that triggered an inordinate number of spurious transit detections across multiple targets. In addition, the previous decomposition of the Bayesian maximum a posteriori (MAP) correction (Smith et al. 2012) into multiple timescales was extended to include the shortest (1.5 hr) timescale. As noted by Twicken et al. (2016), the improvements in version 9.3 to the generation and treatment of the time series decreased their noise, as measured by the combined differential photometric precision (Christiansen et al. 2012), by a few percent on average.

The corrected time series are then searched for periodic transit-like signals by the TPS module (Jenkins et al. 2017). TPS first prepares the time series, removing harmonic features and various flavors of outliers and then applying a wavelet-based matched filter to whiten the noise (i.e., equalize the noise contributions across frequencies). The time series is then searched for periodic signals with at least three events and a statistical significance as measured by the MES exceeding the 7.1σ threshold. For each identified TCE, TPS then performs a number of additional checks to examine the robustness and uniformity of the signal, and it vetoes signals that do not pass the checks.
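
The two detection criteria described above (at least three events and an MES above 7.1σ) can be sketched as a simple check. The signal-strength estimate used here is a white-noise approximation introduced only for illustration; the pipeline itself computes the MES with a wavelet-based matched filter on whitened data. All function names and example values are hypothetical.

```python
import math

MES_THRESHOLD = 7.1  # detection threshold quoted in the text
MIN_TRANSITS = 3     # minimum number of events quoted in the text


def approx_mes(depth_ppm, noise_ppm, points_per_transit, n_transits):
    """White-noise approximation to the folded signal strength.

    Assumes independent Gaussian noise of noise_ppm per observation, so the
    significance grows as the square root of the number of in-transit points.
    """
    n_points = points_per_transit * n_transits
    return depth_ppm / noise_ppm * math.sqrt(n_points)


def is_threshold_crossing(mes, n_transits):
    """Apply the event-count and MES criteria, before any vetoes."""
    return n_transits >= MIN_TRANSITS and mes > MES_THRESHOLD


# Hypothetical signal: 100 ppm depth, 80 ppm per-point noise,
# 6 observations per transit.
for n in (2, 4, 8):
    mes = approx_mes(100.0, 80.0, 6, n)
    print(n, round(mes, 2), is_threshold_crossing(mes, n))
```

In this toy case the signal crosses the threshold only once enough transits have accumulated, which is the same qualitative behavior that makes the number of transits relevant to the detection efficiency.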

There were several important updates to TPS in version 9.3. These included the following: (1) the number of harmonic components removed in each quarter was made a function of the length of the quarter, to reduce overfitting of signals in short quarters; (2) the whitening was performed quarter-by-quarter instead of on the time series as a whole, to compensate for discrete noise properties in each quarter; (3) the update to the whitening algorithm necessitated an update to the long (>2.5 day) gap-filling algorithm, using a sigmoid taper instead of a linear taper at the center of the gap, to avoid the artificially low noise properties that were occurring in the gaps; and (4) the rms noise calculations performed in the wavelet analysis were updated to use a nondecimated moving median absolute deviation (MAD), to more accurately represent the noise properties. All of these updates were designed to improve the sensitivity of the transit search.
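
To illustrate update (4), the sketch below computes a nondecimated moving MAD, meaning that one robust noise estimate is produced for every sample rather than for a subsampled set. The window length, the truncation of windows at the ends of the series, and the 1.4826 scaling to a Gaussian-equivalent standard deviation are assumptions made for this example; the pipeline applies its calculation within the wavelet analysis.

```python
import statistics


def moving_mad(values, window):
    """Nondecimated moving median absolute deviation.

    A MAD is computed for a window centered on every sample (no decimation).
    Windows are truncated at the ends of the series. The result is scaled by
    1.4826 so that it estimates the standard deviation for Gaussian noise.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        med = statistics.median(chunk)
        mad = statistics.median(abs(v - med) for v in chunk)
        out.append(1.4826 * mad)
    return out


# Toy series (hypothetical values) with one large outlier in the middle.
series = [1.0, 2.0, 1.0, 2.0, 50.0, 1.0, 2.0, 1.0, 2.0]
noise = moving_mad(series, window=5)
print([round(x, 3) for x in noise])
```

Because the estimate is median based, the single outlier barely changes the local noise level, which is the property that makes a MAD attractive for this purpose.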

Once a given time series is found to host a TCE that passes all the vetoes, it is passed to the Data Validation (DV) module (Twicken et al. 2018; Li et al. 2019). DV performs a transit fit for the first TCE using the Mandel & Agol (2002) prescription. The in-transit observations are then removed, and the subsequent time series is then sent back to TPS for additional scrutiny. This is repeated until the time series produces no more TCEs, the limit of the number of TCEs (10) is reached, or the time limit12 for DV to search a given time series is reached. DV then produces a suite of diagnostic tests and plots for each TCE.
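
The iteration between TPS and DV described above can be sketched as a loop with three stopping conditions. The `search` and `remove_in_transit` callables are hypothetical stand-ins for the TPS search and the DV removal of in-transit observations, and the time limit is left as a parameter because its value is not given in this section.

```python
import time

MAX_TCES = 10  # limit on the number of TCEs per target quoted in the text


def iterative_search(flux, search, remove_in_transit, time_limit_s):
    """Search, excise, and re-search a time series until a stop condition.

    search(flux)                 -> a detection, or None (hypothetical)
    remove_in_transit(flux, det) -> flux with that signal's in-transit
                                    points removed (hypothetical)
    """
    start = time.monotonic()
    detections = []
    while len(detections) < MAX_TCES:
        if time.monotonic() - start > time_limit_s:
            break  # time limit reached
        det = search(flux)
        if det is None:
            break  # no more TCEs
        detections.append(det)
        flux = remove_in_transit(flux, det)
    return detections


# Toy stand-ins: a "detection" is the index of the deepest remaining dip.
def toy_search(flux):
    dips = [(v, i) for i, v in enumerate(flux) if v is not None and v < -5.0]
    return min(dips)[1] if dips else None


def toy_remove(flux, index):
    return [None if i == index else v for i, v in enumerate(flux)]


found = iterative_search([0.0, -9.0, 0.5, -7.0, 0.1], toy_search, toy_remove, 1.0)
print(found)
```

The strongest signal is found and removed first, so each later search operates on a time series with fewer observations.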

The process by which TCEs were dispositioned into Kepler Objects of Interest and then into planet candidates or false positives evolved considerably over the course of the Kepler mission, from individual decisions made by eye, to team decisions made by multiple eyes, to team decisions made using a set of metrics, to a suite of algorithms dubbed the "Robovetter" automatically evaluating that set of metrics. Taking the people out of the process was the most important step toward quantifying the detection efficiency of the process and the prime motivation toward development of the Robovetter.

For DR25, the final list of TCEs and diagnostics is passed to the Robovetter. The details are provided in Coughlin et al. (2016) and updated in Thompson et al. (2018); see Figure 4 of the latter for a schematic overview. Table 3 of Thompson et al. (2018) provides the full suite of tests performed by the Robovetter, but in summary, in order for a TCE to be promoted to a planet candidate, it must satisfy a number of criteria, the most important of which are as follows:

  1. Not have an ephemeris which matches that of a previously identified TCE in any light curve, including the light curve being analyzed;
  2. Not have a secondary eclipse inconsistent with a planetary origin;
  3. Not have statistically significant depth changes between the odd-numbered events and the even-numbered events (indicating an eclipsing binary system);
  4. Not have a significantly V-shaped folded transit signal (also indicating an eclipsing binary system);
  5. Have consistent depths for all measured transits (such that the folded signal strength is not dominated by one deeper event);
  6. Be unique and statistically significant when compared to the correlated noise properties of the light curve when folded at the period of the TCE; and
  7. Comprise at least three transits that have all individually passed a battery of additional tests interrogating their shape, coverage, and whether they fall during times that produce an inordinately high number of (likely spurious) signals.

Criterion 1 (identifying ephemeris matches) eliminates ∼0.05% of injected planets, criteria 2–5 (identifying stellar eclipses) eliminate ∼1.3% of injected planets, and criteria 6–7 (identifying non-transit-like events) eliminate ∼12% of injected planets; see Thompson et al. (2018) for additional details.
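
As an illustration of the kind of test behind criterion 3, the sketch below compares inverse-variance weighted mean depths of the odd-numbered and even-numbered transits and expresses their difference in units of the combined uncertainty. This statistic, the function name, and the example depths are assumptions for illustration; the Robovetter's own metrics and thresholds are described in Thompson et al. (2018).

```python
import math


def odd_even_significance(depths, depth_errors):
    """Significance of the difference between odd and even transit depths.

    depths       : per-transit depths in time order (first transit is odd)
    depth_errors : 1-sigma uncertainty on each depth
    Uses inverse-variance weighted means of the two subsets.
    """
    if len(depths) != len(depth_errors) or len(depths) < 2:
        raise ValueError("need matching depths and errors for >= 2 transits")

    def weighted_mean(vals, errs):
        weights = [1.0 / e ** 2 for e in errs]
        mean = sum(w * v for w, v in zip(weights, vals)) / sum(weights)
        return mean, math.sqrt(1.0 / sum(weights))

    odd, odd_err = weighted_mean(depths[0::2], depth_errors[0::2])
    even, even_err = weighted_mean(depths[1::2], depth_errors[1::2])
    return abs(odd - even) / math.hypot(odd_err, even_err)


# Hypothetical depths in ppm, each with a 20 ppm uncertainty.
planet_like = odd_even_significance([500.0, 510.0, 495.0, 505.0], [20.0] * 4)
binary_like = odd_even_significance([1000.0, 600.0, 1010.0, 590.0], [20.0] * 4)
print(f"planet-like: {planet_like:.1f} sigma, binary-like: {binary_like:.1f} sigma")
```

A large value suggests alternating primary and secondary eclipses of a binary folded at half its true period, whereas a transiting planet should give consistent odd and even depths.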

In the final mission-supported run of the Robovetter, the algorithms were tuned to maximize the reliability of the catalog for a given minimum completeness of shallow signals at long periods. This necessarily resulted in a sacrifice in the completeness of the catalog, that is, in the number of true positives promoted by the Robovetter. The goal of the transit injection and recovery experiment described in this work is to quantify the fraction of true positives that are lost as a result of this fine-tuning. In DR25, the final run of the Robovetter resulted in 8054 TCEs being classified as Kepler Objects of Interest, and 4034 of those as planet candidates.
