High-precision stellar limb-darkening in exoplanetary transits

Characterization of the atmospheres of transiting exoplanets relies on accurate measurements of the extent of the optically thick area of the planet at multiple wavelengths with a precision $\lesssim$100 parts per million (ppm). Next-generation instruments onboard the James Webb Space Telescope (JWST) are expected to achieve $\sim$10 ppm precision for several tens of targets. A similar precision can be obtained in modelling only if other astrophysical effects, including the stellar limb-darkening, are accounted for properly. In this paper, we explore the limits on precision due to the mathematical formulas currently adopted to approximate the stellar limb-darkening, and to the use of limb-darkening coefficients obtained either from stellar-atmosphere models or empirically. We propose a new limb-darkening law with two coefficients, `power-2', which outperforms other two-coefficient laws adopted in the literature in most cases, and particularly for cool stars. Empirical limb-darkening based on two-coefficient formulas can be significantly biased, even if the light-curve residuals are nearly photon-noise limited. We demonstrate an optimal strategy to fitting for the four-coefficients limb-darkening in the visible, using prior information on the exoplanet orbital parameters to break some of the degeneracies that otherwise would prevent the convergence of the fit. Infrared observations taken with the James Webb Space Telescope (JWST) will provide accurate measurements of the exoplanet orbital parameters with unprecedented precision, which can be used as priors to improve the stellar limb-darkening characterization, and therefore the inferred exoplanet parameters, from observations in the visible, such as those taken with Kepler/K2, JWST, other past and future instruments.


INTRODUCTION
Observations of transits offer the most accurate means of measuring exoplanet sizes and orbital inclinations, as well as mean stellar densities and, if combined with radial-velocity information, system masses. Transits are revealed through periodic drops in the stellar flux, due to the partial occultation of the stellar disk by the planet for a portion of its orbit. The amplitude of the flux decrement is primarily determined by the size of the planet relative to the star, but also depends on the location of the occulted area of the stellar disk and the wavelength observed, because of limb-darkening (the radial giuseppe.morello.11@ucl.ac.uk decrease in specific intensity). Inadequate treatment of limb-darkening may give rise to 10% errors in exoplanetary radii inferred from transits observed at UV or visible wavelengths, and accurate modeling is paramount in the study of the exoplanetary atmospheres, where differences of 10-100 parts per million (ppm) in transit depths at different wavelengths can be attributed to the wavelength-dependent optical depth of the external layers of the planet, rather than to stellar properties.
Stellar-atmosphere models are commonly used to predict the limb-darkening profiles, but empirical estimates are desirable, both to test the stellar models and to reduce potential biases in transit depths due to errors in the theoretical models or to other second-order effects, such as stellar activity, granulation, gravity darkening, etc. Other than for the Sun, the surface of which can be directly observed in great detail, techniques to map the stellar intensity distributions rely mainly on optical interferometry (e.g., Hestroffer 1997;Lane et al. 2001;Wittkowski et al. 2001;Aufdenberg et al. 2005) and microlensing (e.g., Witt 1995;Fields et al. 2003;Dominik 2004;Zub et al. 2011). The former is useful for only a very limited number of stars with large angular diameters, while the latter is limited by the low occurrence rate and non-repeatability of the microlensing events. Eclipsing binaries offer another route to mapping stellar surfaces, but accurate modeling of these systems is handicapped by complicating factors (gravity darkening, reflection effect, tidal distortion. . .), and a degree of redundancy between limb-darkening and radii. These issues are much reduced in most star+exoplanet systems, thanks to the smaller mass and size of the planetary companions (Wilson & Devinney 1971;Loeb & Gaudi 2003;Pfahl et al. 2008).
In this paper, we explore the potential biases in high-precision exoplanet spectroscopy using approximate stellar limb-darkening parameterizations, with coefficients obtained either from stellar-atmosphere models or empirically.
1.1. Structure of the paper Section 2 reviews the limb-darkening laws most commonly adopted in the exoplanet literature, the proposed power-2 law, and discusses the current approaches to obtain theoretical and empirical limb-darkening coefficients. Section 3 describes how we simulate light-curves from spherical-atmosphere models, and Section 4 reports the results of our analyses. In particular, Section 4.1 outlines the main differences between planeparallel and spherical stellar-atmosphere models; in Sec-tions 4.2 and 4.3 we analyze the precision with which different limb-darkening laws describe the intensity profile and the transit morphology, and derive the correct transit depth. Section 4.4 describes the equivalent analysis for the case of empirical limb-darkening coefficients, (i.e., allowed as free parameters in the light-curve fit).
Section 4.5 then focuses on the potential errors in 'narrow-band exoplanet spectroscopy' over short wavelength ranges, specifically in the context of Hubble Space Telescope (HST)/WFC3 observations. Section 5 examines the ability to fit a set of transit parameters and limb-darkening coefficients on transit light-curves, and develops an optimal strategy to maximize the accuracy in the estimated transit parameters and limb-darkening coefficients in the visible, if infrared observations are also available. Finally, Section 6 discusses the results of our analysis, with emphasis on the synergies between the James Webb Space Telescope (JWST) and Kepler, and on future surveys.

Limb-darkening parameterizations
In exoplanetary studies, the stellar limb-darkening profile is typically described by an analytical function I λ (µ), where I denotes the specific intensity, µ = cos θ, θ is the angle between the surface normal and the line of sight, and the λ subscript refers to the monochromatic wavelength or effective wavelength of the passband at which the specific intensities are given. For circular symmetry, µ = √ 1 − r 2 , where r is the projected radial co-ordinate normalized to a reference radius.
Numerous functional forms to approximate I λ (µ) have been proposed in the literature. In the study of exoplanetary transits, the most commonly used of these limbdarkening 'laws' are: 1. the quadratic law (Kopal 1950), 2. the square-root law (Díaz-Cordovés & Giménez 1992), 3. the four-coefficient law (Claret 2000), hereinafter referred to as "claret-4". The quadratic, square-root, and claret-4 laws rely on linear combinations of fixed powers of µ. In this paper, we advocate an alternative two-coefficient law incorporating an arbitrary power of µ which, to the best of our knowledge, has not previously been considered in the exoplanet literature (and which we initially constructed independently): 4. the 'power-2' law (Hestroffer 1997), We find that this form offers more flexibility and a better match to model-atmosphere limb-darkening than do other two-coefficient forms (Section 4.2). The claret-4 law can provide a more accurate approximation to model-atmosphere limb-darkening than other forms, but at the expense of using a larger number of coefficients. We note that the quadratic and square-root laws are subsets of the claret-4 prescription, with a 1 = a 3 = 0, a 2 = u 1 + 2u 2 , a 4 = −u 2 (quadratic) and a 3 = a 4 = 0 (square-root). The power-2 form is a subset only for α = 1 /2, 1, 3 /2, or 2.

Intensity distributions: plane-parallel vs. spherical
Theoretical limb-darkening coefficients can be obtained from stellar-atmosphere models, by fitting a parametric law (such as Equations 1-4) to detailed numerical evaluations of I λ (µ) using some suitable numerical technique -typically least squares, though detailed numerical results depend on both the method chosen and the data sampling (e.g., Heyrovský 2007;Claret 2008;. Tables of theoretical limb-darkening coefficients as a function of stellar parameters (usually the effective temperature, gravity, and metallicity) have been published by several authors for various photometric passbands. Most calculations are based on planeparallel atmosphere models (Claret 2000(Claret , 2004(Claret , 2008Sing 2010;Howarth 2011b;Claret et al. 2012Claret et al. , 2013Reeve & Howarth 2016), but some authors have considered spherical geometry, claiming, in some cases, noteworthy improvements compared to the use of planeparallel models (Claret & Bloemen 2011;Hayek et al. 2012;Neilson & Lester 2013a,b;Claret et al. 2014;Magic et al. 2015).
These spherical models show a characteristic steep drop-off in intensity at small, but finite µ (see, e.g., Figure 3). The explanation for this drop-off is straightforward (Figure 1).
In a plane-parallel atmosphere, the optical depth always reaches unity somewhere along the line of sight, even at grazing incidence. The limb of the star is, consequently, geometrically well-defined (and wavelength independent), and the intensity at the limb is comparable to the intensity at the center of the disk, to within a factor of a few.
In spherical geometry, in contrast, there are no constant angles µ at which the characteristic rays intersect the shells; instead they vary as a function of radius. For technical reasons the emergent intensities are usually specified as functions of the angle as measured at the outer boundary of the model atmosphere, which is originally set by the modeller at an arbitrary physical   Table 1. Symbols represent spherical-geometry, passband-integrated intensities for STIS/G430L (purple squares), STIS/G750L (blue "*"), WFC3/G141 (green "+"), IRAC/ch1 (yellow "x") and IRAC/ch4 (red dots); the corresponding plane-parallel intensities are shown as continuous lines of the same colors. The right-hand panels show the limb region (0.1≥ µ ≥0.0) at a larger scale and the µ angle corresponding to τRoss(r) = 1.
radius or reference optical depth, subject only to the condition that it has a suitably small opacity and emissivity even at the cores of strong lines. The outermost layer of the model atmosphere, corresponding to µ = 0 in this reference frame, is therefore always optically thin and does not correspond to what would be observed as 'the' stellar radius in investigations involving interferometric imaging, lunar occultation, or exoplanetary transits. Furthermore, the rapid changes in I λ (µ) that arise at small µ in spherical models are, inevitably, not well approximated by any of the standard parametric laws developed to represent results of plane-parallel models, because the intensity does not converge to zero at any given radius (see, e.g., Figure 1 of Neilson et al. 2017). Wittkowski et al. (2004) and Espinoza & Jordan (2015) therefore suggested to re-define µ = 0 to a radius outside of which almost no flux is observed, i.e. at the inflection point of the intensity profile so that the tail-like extension originating from the optically thin outer layers is excluded from fitting the limb-darkening profile. The radius is reasonably well defined by the point at which the gradient dI(µ)/dµ, or almost equivalently |dI(r)/dr|, reaches a maximum. Wittkowski et al. (2004) found that this radius corresponded closely to the  Top panel: Plane-parallel (black "+") and spherical (black line) model-atmosphere intensities vs. µ for the M5 V star in the WFC3/G141 passband. Parametric limb-darkening functions fitted to plane-parallel intensities (theoretical models) are quadratic (green), square-root (yellow), claret-4 (red), and power-2 (blue) laws. Bottom panel: large-scale plots of residuals of parametric limb-darkening laws for model-atmosphere intensities in the plane-parallel and spherical geometries (continuous and dashed lines, respectively). Rosseland radius, defined by a Rosseland optical depth along the normal τ Ross (r) = 1, for the M giant models they presented. Close to the limb, however, one can expect to observe significant emission as long as the lineof-sight optical depth is at least of order unity. Additionally, in the context of modeling exoplanetary transits, the projected radius for which the total line-of-sight optical depth within the observed wavelength band becomes one should be considered. In Wittkowski et al. (2004) the differences between radial and line-of-sight optical depths at a given physical depth were relatively modest because the giant-star atmosphere they modeled has a large radial extension with corresponding large angles µ = 0.3. Furthermore, they studied the emergent intensities in the K band, which has relatively smaller mean opacity than the Rosseland mean, thus partly cancelling the effect of the off-normal incidence.
In this paper, we test empirically the choice of stellar radius, based on the ratios between best-fit and input transit depths (see Section 4.1), finding different values for the best renormalization radius in different wavelength bands (corresponding to the different band-mean opacities). But these radii are larger throughout than the respective τ Ross (r) = 1, which correspond to µ = 0.0386, 0.049, and 0.0738 for the three models displayed in Figure 3, confirming the effect of the off-normal incidence on the emission near the limb.
Measuring empirical limb-darkening coefficients is important to test the validity of the stellar-atmosphere models and, if results are sufficiently accurate, to select the best theoretical models. Furthermore, fixing limb-darkening coefficients at incorrect theoretical values can significantly bias other fitted transit parameters, leading to incorrect inferences about planetary sizes and masses, or confusing the spectral signature of a planetary atmosphere (Csizmadia et al. 2013). In active stars, the presence of dark or bright spots on the surface can change the 'effective' limb-darkening coefficients relative to the unperturbed case, as well as the inferred stellar parameters adopted to compute the theoretical coefficients (Ballerini et al. 2012;Csizmadia et al. 2013). Other light-curve distortions may arise from gravity darkening in fast-rotating stars (von Zeipel 1924;Barnes 2009;Claret et al. 2012), stellar oscillations (Broomhall et al. 2009;Kjeldsen & Bedding 2011), granulation (Chiavassa et al. 2016), beaming (Zucker et al. 2007;Shporer et al. 2012), ellipsoidal variations (Pfahl et al. 2008;Welsh et al. 2010), reflected light (Borucki et al. 2009;Snellen et al. 2009), planetary thermal emission (Kipping & Tinetti 2010), or exomoons (Sartoretti & Schneider 1999;Kipping 2009,b). The photometric amplitudes of such distortions can be up to ∼100 ppm.

SIMULATED TRANSIT LIGHT-CURVES
In order to investigate the consequences of various approximations to limb-darkening, we calculated 'exact' synthetic transit photometry as a reference, using new model-atmosphere intensities coupled to an accurate numerical integration scheme for the light-curves.

Stellar models
We generated three representative model atmospheres using the Phoenix simulator (Allard et al. 2012); input parameters are summarized in Table 1. These models are intended to bracket the range in effective temperature of known exoplanet host stars, and embrace ∼98% of those listed in the exoplanet.eu database (Schneider et al. 2011 For each stellar model, I λ (µ) profiles were calculated in both plane-parallel and spherical geometries. In the former case, intensities were calculated at 96 values of µ, chosen as the anchor points for a Gaussianquadrature integration; the intervals ∆µ i = |µ i+1 − µ i | vary in the range 7 × 10 −4 -1.6 × 10 −2 and are smallest for µ ∼0 and ∼1. In spherical geometry, the µ values were determined by the properties of the model  Allard et al. (2012). The corresponding spectral types are based on the calibration reported in Gray & Corbally (2009). Table 2. Input transit parameters adopted in simulations Note-Transit parameters are similar to those of HD189733b: p = Rp/R * is the ratio of planet-to-star radii, aR = a/R * the orbital semimajor axis in units of the stellar radius, i the orbital inclination, e the eccentricity, b(= aR cos i) the impact parameter, and P the orbital period.
atmosphere, and the number of grid points is modeldependent (169-177 in the cases considered here); the limb is the most finely sampled region, down to ∆µ ∼ 6 × 10 −5 . Passband-integrated intensities were calculated for five instruments which have been widely used in the field of exoplanet spectroscopy, from the visible to mid-infrared wavelengths: the STIS/G430L, STIS/G750L and WFC3/G141 gratings onboard HST, 1 and the IRAC photometric channels 1 and 4 onboard Spitzer. The throughputs of these instruments are shown in Figure 2, and the corresponding plane-parallel and spherical-model intensities are shown in Figure 3.

Computing transit light-curves from spherical model atmospheres
We generated two sets of 'exact' transit light-curves for the exoplanet-system parameters reported in Table 2. Each set consists of fifteen transit light-curves, one for each stellar model and instrument passband, using the spherical-geometry intensities; the sets differ only in the impact parameter (or, equivalently, the orbital inclination). Each light-curve contains 2001 data points with 8.4 s sampling time, over a ∼4.7 hr interval centered on the mid-transit (the total duration of the transits is ∼2 hr, with the central transit being ∼10 min longer).
The orbital parameters determine z, the sky-projected star-planet separation in units of the stellar radius at any given time; for a circular orbit where a R is the semimajor axis in units of the stellar radius, P is the orbital period, i is the inclination relative to the sky, and t 0 is the time of conjunction. The fraction of stellar light occulted by the planet is, for a given intensity profile, a function F (p, z(t)), where p is the ratio of planet-to-star radii. Instead of using an analytical function (requiring a numerical approximation to the intensity profile), we computed the lightcurve by direct integration of the occulted stellar flux, using our purpose-built 'tlc' algorithm. The algorithm: 1. divides the sky-projected stellar disk into a userdefined number of annuli, n, with uniform radial separation, dr = 1/n; 2. evaluates the intensity at the central radius of each annulus, I(r i ), where r i = (0.5 + i)/n for i = 0 . . . n − 1, interpolating in µ from the input stellar-intensity profile (and r i = 1 − µ 2 i ); 3. evaluates the flux from each annulus, F i = I(r i ) × 2πr i dr, and hence the total stellar flux, 4. The occulted flux is then calculated as F occ (p, z) = n−1 i=0 F i f z,p (r i ), where f z,p (r i ) is the fraction of circumference of each annulus covered by the planet, given by Before calculating the actual transit light-curves from the spherical model intensities, we tested the accuracy of the algorithm using a wide range of parametric intensity profiles as input, with the same grid of µ values as the spherical models, comparing the resulting lightcurves to those from analytical calculations. We found that, with n = 100 000 annuli, the maximum differences between tlc and analytical light-curves were < 5 × 10 −7 in the worst-case scenarios -negligible compared to the minimum uncertainties that can be obtained with any current or forthcoming instrument.

MODELING TRANSIT LIGHT-CURVES
In empirical studies, it is generally convenient to analyse observed transit light-curves using parameterized models in order to fit for the unknown transit parameters and/or limb-darkening coefficients. To mimic this observational approach we employed pylightcurve, 2 our pipeline dedicated to the fast computation of model transit light-curves with a parametric limb-darkening profile. The power-2 parameterization (Equation 4) was implemented in the code for this work. Based on our proposal, the power-2 law has been implemented also in the batman code (Kreidberg 2015).

Plane-parallel vs. spherical limb-darkening models
As outlined in Section 2.2, discrepancies between the plane-parallel and spherical limb-darkening models are larger at smaller µ (i.e., closer to the stellar limb), solely because of the manner in which µ is defined, at least in the first step, for the spherical models. The spherical models present a steep drop-off in the normalized intensity I(µ)/I(1), approaching zero at some µ >0, while for the plane-parallel models the intensity is significantly greater than zero for all µ (see Figure 3).
It is reasonable to suppose that the 'photometric' stellar radius relevant to transit studies is better represented by the projected radius of the intensity drop-off than by µ = 0, the arbitrary uppermost layer in the atmospheric model. As a pragmatic approach, we assign this projected photometric radius to the point in the intensity distribution at which the gradient dI(µ)/dµ reaches a maximum (estimated as the mean µ value between the two consecutive µ values in the model with the maximum difference quotient, The corresponding radius, r 0 = 1 − µ 2 0 , hereinafter called the 'apparent' radius, is the ratio between the stellar photometric radius and the radius of the outermost layer in the model. Our approach is similar to what is suggested by Wittkowski et al. (2004) and Espinoza & Jordan (2015), but here we compute the photometric radius for each passband, while they calculate one wavelength-averaged photometric radius.
With this working definition, the best-fit model parameters for any of our simulated transit light-curves are expected to deviate from their input values accord- ing to: Table 3 reports the ranges of r 0 over the five instrument passbands for the given stellar model, the corresponding percentage variation in transit depth, p 2 = (R p /R * ) 2 , and the absolute variation evaluated at p input = 0.15. In the analytical approximations represented by Equa-tions 7-8, we find the apparent stellar radius to be systematically smaller than the radius of the uppermost layer of these models by 0.05-0.1% for the M dwarfs, and up to ∼0.2% for the F0-star model; the corresponding percentage errors in transit depths are about twice as large. For the case of a transiting hot Jupiter (p = 0.15), the discrepancies in transit depth are at the level of ∼20, 40, and 100 ppm for the two M dwarfs and for the F0 model, respectively. The discrepancies measured for the F0 model currently represent practical upper limits for exoplanet host stars, given that ∼99% of the current population is cooler, and hence have less extended atmo-  Table 3. Measured ranges of apparent stellar radius, r0, over five instrument passbands for the three stellar models; corresponding percentage variations in transit depth, p 2 = (Rp/R * ) 2 ; and absolute variations, evaluated at pinput = 0.15. spheres (for given log g). The wavelength-dependence of the apparent radius is negligible over the parameter space explored here, with a peak-to-peak amplitude of 11 ppm, in transit depth, from visible to mid-infrared wavelengths in the worst-case scenario (see Table 3).

Accuracy of the theoretical limb-darkening laws
We fitted the limb-darkening laws to the plane-parallel intensity profiles by adopting a simple least-squares method in the fits. We checked, both by using subsets of the precalculated intensity grids and interpolating at different angles, that similar results would be obtained using a uniform sampling in µ. Figure 4 shows the corresponding best-fit models, hereinafter referred as "theoretical" limb-darkening models, and their residuals, for the case of the M5 V observed in the WFC3 passband. The full list of models and the relevant residuals are reported in Figure A1 (Appendix A). The power-2 law (Equation 4) outperforms the other twocoefficient laws at describing the stellar limb-darkening of all stars observed at near to mid-infrared wavelengths with the HST /WFC3 and Spitzer /IRAC instruments; in some cases, the power-2 model outperforms even the corresponding claret-4 one. At visible wavelengths, the square-root and power-2 models have comparable success, while the claret-4 models fit best. The average errors in specific intensity predicted by the power-2 models are in the range 0.1-1.0%, with a maximum error up to ∼5-7% for the F0 V model in the visible passbands. The claret-4 models are more uniformly robust among all the configurations, with average errors in the range 0.05-0.6% and maximum errors <4%. The quadratic models are the least accurate of those tested, with average errors in the range 1-6% and maximum errors of up to 25% (for the M0 V model in the WFC3 passband).

Transit models with theoretical limb-darkening coefficients
We measured the potential biases in the model transit depths by fixing the limb-darkening coefficients at the theoretical values obtained from the plane-parallel stellar-atmosphere models and fitting the exact lightcurves described in Section 3.2. The free parameters in the fit were p, the ratio of planet-to-star radii, a R , the orbital semimajor axis in units of the stellar radius, and i, the orbital inclination. We used a Nelder-Mead minimization algorithm to find the values of these parameters which minimize the residuals between the model fits and the exact light-curves. We then carried out Markov-chain Monte-Carlo (MCMC) runs with 300,000 iterations to assess the robustness of the point estimates. Unlike previous investigations reported in the literature (e.g. Csizmadia et al. 2013), we seek to isolate the potential biases arising from the analysis method, and particularly the use of simplified geometry and a parameterization to characterize the stellar limb-darkening. No other astrophysical sources of error are considered in this study. Figure 5 illustrates the differences between the bestfit transit depths and input values for i = 90 • ; the expected values (from Equations 7-8) are also indicated. For all stellar models, the results are less dependent on the parametric law at longer wavelengths; this is to be expected, since the limb-darkening is smaller at longer wavelengths. In particular, the transit depths obtained at 8µm (IRAC channel 4) are all within 45 ppm of expected values, or within 13 ppm if adopting the power-2 or claret-4 coefficients. Overall, the transit depths obtained using the claret-4 coefficients deviate by less than ∼20 ppm from expected values, other than for the M5 V model in the visible passbands, where the discrepancy reaches 34 and 80 ppm for the STIS/G750L and STIS/G430L passbands. The peak-to-peak amplitudes in best-fit transit depths over the five passbands are 94, 28, and 8 ppm, going from the coolest to the hottest model. The results obtained with the power-2 coefficients are more robust for the cooler stars, and are within 44 ppm of expected values, except for the F0 V model in the visible passbands, where the inferred transit depths are 105 and 88 ppm larger for the  STIS/G750L and STIS/G430L passbands. The peakto-peak amplitudes in best-fit transit depths over the five passbands are 47, 44, and 102 ppm, again from the coolest to the hottest model. The quadratic-law coefficients have the largest scatter in the best-fit transit depth across the different passbands for all models, with peak-to-peak amplitudes of 250, 164, and 107 ppm.
Even though the true value of the transit depth is not known in a 'real-world' scenario, the presence of biases can be revealed by time-correlated noise in the lightcurve residuals. Figure 6 shows the residuals between the exact light-curve and the best-fit parametric model for the M5 V star in the WFC3 passband. The full list of light-curve residuals is reported in Figure A2. The amplitudes of the time-correlated residuals (maximum discrepancies from zero) are in the ranges 97-456, 8-105, and 11-75 ppm with quadratic, power-2, and claret-4 models respectively. Residuals at infrared wavelengths are typically smaller than in the visible, as expected. Neilson et al. (2017) report similar amplitudes for the residuals between the exact light-curves, computed with their CLIV stellar-atmosphere models, and parametric light-curve models. For comparison, residuals with ∼10 ppm root mean square (rms) amplitude have been obtained from the phase-folded Kepler photometry of several targets (e.g., Barnes et al. 2011;Müller et al. 2013), and ∼50-200 ppm rms amplitude is typically obtained for the white light-curves observed with the HST /WFC3 (e.g., Deming et al. 2013;Tsiaras et al. 2016,b).
It is possible that better results would be obtained if the limb-darkening coefficients were fitted adopting a different sampling in µ (e.g., uniform in r rather than in µ), a different method (e.g., imposing flux-conservation), and/or using spherical intensities Left panels: differences between the best-fit and input transit depths for the three stellar models, with b = 0 (red, full diamonds), and 0.5 (orange, empty diamonds), using empirical claret-4 limb-darkening coefficients; the expected values are denoted by black "+". Right panels: the same, but with empirical power-2 limb-darkening coefficients, b = 0 (blue, full circles), and 0.5 (cyan, empty circles). We repeated the fits to the exact light-curves with the limb-darkening coefficients as free parameters (in addition to p, a R , and i). The increased flexibility allows parametric models that better match the tran-sits, as shown by the smaller time-correlated residuals in Figures 6 and A2 (Appendix A). The residual amplitudes are in the range 10-63, 0-31, and 0-4 ppm with quadratic, power-2, and claret-4 models respectively. The time-correlated residuals due to imperfections in the transit models with empirical limb-darkening coefficients are hardly detectable with current instruments. Figure 7 shows the corresponding best-fit transit depths. Despite the very small light-curve residuals, the inferred transit depth can be significantly biased. The bias obtained with quadratic limb-darkening is roughly linear in the logarithm of wavelength for the two M dwarfs, ranging from 27-40 ppm at 8µm to 200-225 ppm at 0.4µm; there is no evident trend for the F0 V model, but the transit depth is again systematically over-estimated, by 17-52 ppm. The square-root and power-2 laws have similar performances, with deviations from the expected values smaller than 45 ppm, except for three 'bad' points: the STIS/G430L passband for the M5 V model, and the STIS/G430L and STIS/G750L passbands for the F0 V model, for which the transit depth estimates are smaller than the apparent values by 170-138, 207-205, and 74-73 ppm for the two laws. The transit depth estimates obtained when fitting the claret-4 coefficients are the most accurate, with deviations from the expected values being smaller than 30 ppm (largest in the STIS/G430L passband) and peak-to-peak amplitudes of 41, 36, and 24 ppm from the coolest to the hottest model. Figure 8 shows the empirical limb-darkening models and their residuals for the case of the M5 V observed in the WFC3 passband. The full list of models and the relevant residuals are reported in Figure A3 (Appendix A). It appears that the empirical limb-darkening models are better constrained at larger µ values, corresponding to the inner part of the disk. We also found that the empirical coefficients can be obtained from the stellar-atmosphere models if a uniform sampling in r rather than in µ is used. However, if a functional form is not able to reproduce the intensity profile, the empirical model will be particularly discrepant at the limb, causing larger biases in the best-fit transit depth compared to the case of theoretical coefficients with a uniform sampling in µ. Quadratic models, especially, always overpredict the intensities at the limb, so that an apparently larger planet would be needed to occult the extra stellar flux, in agreement with the larger transit depth estimates.

Inclined transits
For randomly orientated orbits, the inclinations i are distributed such that the probability density of cos i is uniform between 0 and 1. For circular orbits, therefore, the impact parameter, b = a R cos i, is uniformly distributed between 0 and a R , the semimajor axis in units of stellar radius. An exoplanet transits if and only if  Figure 11. Top panel: differences between the best-fit and the input transit depths for the M5 V model over twenty-five wavelength bins in the WFC3/G141 passband, with b = 0, using theoretical (orange, empty diamonds) and empirical (red, full diamonds) claret-4 coefficients. 0 ≤ b < 1 + p. We tested the ability to constrain the stellar limb-darkening profile and to measure the correct transit depth changes for the case b = 0.5. This study was conducted for the claret-4 and power-2 laws, as they led to more robust results than the other parameterizations. In this configuration, the area of the stellar disk with r < b − p = 0.35, or µ 0.94, is never occulted by the transiting exoplanet. Figure 9 shows the comparison between the transit depths estimated for the cases with b = 0 and 0.5, using the claret-4 and power-2 laws. In most cases, there are no significant differences in transit depth obtained for the cases with b = 0 and 0.5. The largest discrepancies (29-68 ppm) are registered for the three 'bad' points of the power-2 law, which are highlighted in Section 4.4.1. The empirical limb-darkening profiles are also very similar. Figure 10 shows the difference for the two most discrepant cases. The parametric models obtained from the transits with b = 0.5 approximate the intensities at the limb slightly better than those obtained from the transits with b = 0, but the bias is also significant. In general, it appears that, if a parametric law does not al-low a good approximation of the limb-darkening profile, the empirical model is optimized toward the center of the disk and significantly deviates at the limb (see also the quadratic fits in Figure 8). In these cases, inclined transits automatically attribute slightly higher weights to the intensities at the limb, as the planet spends more time occulting areas further from the center. Even if the innermost region of the stellar disk is not sampled during the transit, this is not often a problem, at least if b 0.5, because the intensity gradient does not vary significantly near the center. The examples discussed here may suggest that inclined transits can lead to smaller biases in the inferred transit parameters and limb-darkening profiles than edge-on transits, but the improvements appear to be quite small and, more importantly, the error bars have not yet been considered.

Narrow-band (WFC3-like) exoplanet spectroscopy
The results discussed in Section 4.4.1 may suggest that it is difficult to model the depth of a hot-Jupiter transit with an absolute precision better than ∼10-30 ppm, because of the intrinsic limitations of the stellar limbdarkening parameterizations (claret-4 being the most accurate among those currently used). This potential bias will be particularly important when analyzing visible to mid-infrared exoplanet spectra measured with the JWST, as it is comparable with the instrumental precision limit (Beichman et al. 2014).
Since the detection of the chemical species relies on their spectral features, it is not affected by a constant offset in transit depth; hence only the errors in transit depth differences at multiple wavelengths, referred to as relative error, are important. Here, we study the case of exoplanet spectroscopy with HST /WFC3, for which the narrow-band spectra reported in the literature often have ∼20-40 ppm error bars.
To fix the ideas, we considered 25 wavelength bins, identical to those ones adopted in Tsiaras et al. (2016b), to generate one set of exact light-curves (as in Section 3.2) for each stellar model, and calculate the theoretical limb-darkening coefficients. We modeled each exact light-curve using the two approaches outlined in Sections 4.3 and 4.4, i.e., with the theoretical and empirical limb-darkening coefficients, respectively. Figure 11 shows the spectra obtained by using the most accurate claret-4 law. The spectra calculated with the theoretical limb-darkening coefficients are offset by +1, +18, and −18 ppm, on average, in excellent agreement with the measured biases for the broadband lightcurves reported in Section 4.3. The relative errors, as measured by the peak-to-peak amplitudes, are 22, 14, and 8 ppm, from the coolest to the hottest model. The use of empirical limb-darkening coefficients reduces the spectral offsets, in these cases, to less than 3 ppm, and also reduces the peak-to-peak amplitudes down to 9, 5, and 4 ppm, respectively. Note that, in all configurations, the peak-to-peak amplitudes across the WFC3 narrow bands are smaller than the respective amplitudes for the broadband photometry from visible to mid-infrared reported in Section 4.3 and 4.4, and they also decrease with the increasing model temperature.
We remind the reader that the results discussed up to this section focus on the potential biases due to the approximate stellar limb-darkening parameterizations, in the absence of noise. The limits of the actual parameter fitting for light-curves with a low, but realistic, noise level are discussed in Section 5. An additional complication is the presence of temporal gaps in the transit lightcurves observed with instruments onboard the HST, and from satellites operating on low orbits in general. The impact of such gaps in the retrieval of the transit parameters will be discussed in a separate paper (Karpouzas et al. 2017, in preparation).

LIGHT-CURVE FITTING WITH EMPIRICAL
LIMB-DARKENING COEFFICIENTS Sections 4.3 and 4.4 discussed the intrinsic biases due to the use of parametric models with theoretical (planeparallel) or empirical limb-darkening coefficients. We now consider fitting the transit models with empirical limb-darkening coefficients to more realistic light-curves with noise. Those light-curves are obtained by adding gaussian time series to the exact light-curves (see Section 3.2). The standard deviation for the noise time series was set at ∼100 ppm, which is similar to the best photon-noise limit possible for a short-cadence Kepler frame (Van Cleve & Caldwell 2009), or for a single HST /WFC3 scan (Deming et al. 2013;Tsiaras et al. 2016b), taking into account the different integration times. We focused on four cases: two transits with b = 0 and 0.5 across the hottest model star, at 8 µm (F0 V   Figure 12 shows the transit depth posterior distributions for the b = 0 transit and F0 V model in the IRAC/ch4 passband, MCMC sampled with 1 500 000 iterations. Figure 13 reports the relevant chains. Similar parameter chains are computed, in parallel, for a R , i, the limb-darkening coefficients, a normalization factor, and the likelihood's variance. In two cases, the limb- darkening coefficients for the power-2 law are fitted, in the other two cases the claret-4 ones. The sampled posterior distributions of the transit depth, using the power-2 limb-darkening law, are almost identical and, in particular, the mean values and standard deviations differ by less than 1 ppm. Even considering subsets of the chains with 300 000 samples, the mean values and standard deviations are stable to better than 5 ppm. It may appear from Figure 12 that the results are biased, given that the peak of the posterior distribution is more than 1σ away from the expected value. However, by repeating the analysis with different noise realizations, the best-fit transit depth is within 1σ of the expected value in 7 cases out of 10, consistent with the expectations for gaussian noise (see Figure 14). Interestingly, the average of the individual best-fit transit depths is 9 ppm above the expected value, which is very close to the 7 ppm bias measured in the noiseless case (Section 4.4). The chains calculated with the claret-4 law present significant long-term modulations, resulting in wider posterior distributions; there are also moderately large differences between the two repetitions, indicating that they have not converged. The lack of convergence when fitting the claret-4 coefficients is not a surprise, and it is due to strong correlations between the coefficients and with the other transit parameters. Figures B4 and B5 (Appendix B) show the posterior distribution and chains obtained for the inclined transit, with b = 0.5, using the power-2 limb-darkening law. The posterior distributions are wider than those obtained for the edge-on transit, with 1σ ≈ 20 ppm rather than 12 ppm. The estimates from the partial chains with 300 000 samples are also less robust, with the corresponding mean values scattered over a 12 ppm interval.
The accuracy and precision of the empirical limb-darkening profiles are discussed in Appendix C.

M5 V model, STIS/G430L passband
We conducted corresponding studies for two transits in front of the M5 V model in the STIS/G430L passband. Figure 15 shows the transit depth posterior distributions for the edge-on transit, and Figure 16 reports the relevant chains. The sampled posterior distributions of the transit depth, using the power-2 limb-darkening law, are in good agreement and, in particular, the mean values differ by 10 ppm, with standard deviations of 45 and 48 ppm, respectively. As expected, the error bars are larger than those obtained for the less limb-darkened case (with identical noise). The estimates from the fractional chains with 300 000 samples may differ by up to 35 ppm, and the relevant standard deviations are in the range 34-51 ppm. Even in this case, the MCMC process failed to converge when fitting the claret-4 coefficients. Figure 17 reports the transit depth estimates obtained with 10 different noise realizations, using the power-2 law. Note that their average is significantly biased in the same direction as the bias obtained in absence of noise (see Section 4.4), and the 1-σ error bars are smaller than the bias. Figures B6 and B7 (Appendix B) show the posterior distribution and chains obtained for the inclined transit, with b = 0.5, using the power-2 limb-darkening law. The posterior distributions are wider than those obtained for the edge-on transit, i.e. 1σ ≈ 115 ppm rather than 45 ppm.
The accuracy and precision of the empirical limbdarkening profiles are discussed in Appendix C.

The benefits of using prior information
The examples discussed in Section 5.1 and 5.2 show that: passband, with 100 ppm gaussian noise (fitting p, aR, i, normalization factor, and limb-darkening coefficients). The histogram channels (blue and light-blue) are for two chains with 1 500 000 iterations, using the power-2 law; the channels are half-thick and shifted to improve their visualization. The red and orange lines denote the analogous posterior distributions when using the claret-4 law. The input and expected transit depths are also indicated (black vertical lines, continuous and dashed, respectively).  1. if using the power-2 law, the empirical limbdarkening coefficients can be fitted with the other transit parameters (p, a R , i; see Section 3.2) and a normalization factor, but the results can be significantly biased, depending on the stellar model and passband; 2. the analogous fits, using the claret-4 law, fail to converge (at least, using our MCMC routine with up to 1 500 000 iterations).
Unfortunately, all the two-coefficients laws are biased for some stellar types and wavelengths (see Section 4.3 and 4.4), but most of them are sufficiently accurate in the infrared wavelengths. Some authors proposed to fitting one or two coefficients of the claret-4 law, while keep-  ing the other coefficients fixed (e.g., Howarth & Morello 2017). We found that the validity of this approach relies on a good choice of the fixed coefficients, and thus is not fully empirical.
Our proposal is that, if the 'geometric' parameters, a R and i, are measured in the infrared, the results can be implemented as an informative prior when fitting at shorter wavelengths, thanks to their small or negligible wavelength-dependence, based on Equations 8, 9 and Table 3. We tested fitting for p, a R , i, the claret-4 limbdarkening coefficients, and a normalization factor, on the transit light-curves obtained for the M5 V model in the STIS/G430L passband, adopting gaussian priors on a R and i. The parameters of the gaussian priors are reported in Table 4. Figure 18 shows the transit depth posterior distributions for the edge-on and the inclined transits, obtained with 1 500 000 MCMC samples. Figure 19 shows the relevant chains. The use of gaussian priors on a R and i leads to convergence of the MCMC fits with claret-4 coefficients. The error bars in transit depth are significantly smaller than those estimated with power-2 limb-darkening coefficients and non-informative priors on a R and i, falling to ∼25 and ∼50 (compared with ∼45 and ∼115 ppm), for the edge-on and inclined transits, respectively. The biases, averaged over 10 lightcurves with different noise levels, are also smaller (+15 and -7 ppm; Figure 20).
As a final test, we investigated the effect of having a longer integration time, similar to that of the Kepler short-cadence frame. The longer integration time is simulated by binning the transit light-curves over 7 points (7 × 8.4 = 58.8 s). The relevant transit depth posterior distributions are shown in Figure 21; they are almost identical to the non-binned ones. We conclude that an integration time of ∼1 min, as for the Kepler short-cadence frames, does not affect the accuracy (error bar) of the fitted transit depth, compared to shorter integration times. 6. DISCUSSION 6.1. Synergies between JWST and Kepler, K2, TESS Empirical limb-darkening coefficients determined from exoplanetary transit light-curves are desirable, not only to validate the stellar-atmosphere models, but also to improve both the absolute and relative precision of inferred exoplanetary spectra. No two-coefficient limbdarkening law is accurate for all stellar types and/or wavelengths, but can still give near-perfect fits to the transit light-curves, albeit with significantly biased transit parameters and limb-darkening coefficients. To overcome this issue, fitting for the claret-4 limb-darkening coefficients is necessary, but some prior knowledge of the orbital parameters a R and i is required to enable convergence of the fitting algorithms.
Such knowledge can be obtained from the infrared observations, for which the effect of limb-darkening is smaller, and simple two-coefficient laws may be sufficiently accurate. The MIRI instrument onboard JWST will provide suitable observations for tens of exoplanets. A re-analysis of the Kepler and K2 targets, with the ap- The histogram channels (red and orange) are for two chains with 1 500 000 iterations, using claret-4 coefficients and gaussian priors on aR and i; the channels are half-thick and shifted to improve their visualization. Other lines denote the posterior distributions obtained with non-informative priors for all parameters and claret-4 (red and orange) or power-2 (blue and light-blue) limb-darkening coefficients. The input and expected transit depths are also indicated (black vertical lines, continuous and dashed, respectively). Bottom panel: analogous for the inclined transit (b = 0.5), except the red and orange lines report the histograms from the top panel for comparison.
proach developed in Section 5.3, can address some of the controversies reported in the literature (e.g. Southworth 2008;Claret 2009;Müller et al. 2013), if the only problem was the use of inadequate two-parameter limb-darkening laws. The same approach should be used for new observations that will be obtained, in the visible, by TESS and/or other JWST instruments.

CONCLUSIONS
We studied the potential biases in transit depth due to the use of theoretical stellar limb-darkening coefficients obtained from plane-parallel model atmospheres, and when fitting for empirical limb-darkening coefficients, over a range of model temperatures and instrumental passbands. We propose the use of a two-coefficient law, named "power-2", which outperforms the most com-mon two-coefficient laws adopted in the exoplanet literature, especially for the M-dwarf models. Nevertheless, the Claret four-coefficient law is significantly more robust than any simpler one, especially at visible wavelengths. Our results indicate that an absolute precision of 30 ppm can be achieved in the modeled transit depth at visible and infrared wavelegths, with 10 ppm relative precision over the HST /WFC3 passband, depending on the stellar type. The intrinsic bias due to the use of theoretical limb-darkening coefficients obtained from plane-parallel models is also 30 ppm for most exoplanet host stars (F-M spectral types), but this estimate does not take into account the uncertainties in the stellar models and in the measured stellar parameters, or the effect of stellar activity and other second-order effects.
Finally, we developed an optimal strategy to fitting for  the four-coefficient limb-darkening in the visible, using prior information on the exoplanet orbital parameters to break some of the degeneracies. This novel approach could solve some of the controversial results reported in the literature, which relies on empirical estimates of quadratic limb-darkening coefficients. The forthcoming JWST mission will provide accurate information on the orbital parameters of transiting exoplanets through observations performed by MIRI, enabling wide application of the approach developed in this paper.     Figure A3. Top panels: Plane-parallel (black '+') and spherical (black line) angular intensity vs. µ. Parametric limb-darkening with empirical limb-darkening coefficients fitted to the transit light-curves with b = 0, using quadratic (green), square-root (yellow), claret-4 (red), and power-2 (blue) law. Bottom panels: zoom of the residuals between parametric limb-darkening and spherical intensities (continuos lines), plane-parallel intensities (dashed lines).

B. SUPPLEMENTAL FIGURES: MCMC FITTING RESULTS FOR INCLINED TRANSITS
This Appendix contains Figures B4-B7 showing the histograms and chains for the inclined transits, analogous to those ones presented in Sections 5.1-5.2 for the edge-on transits (Figures 12-13 and 15-16 Posterior density star F0V (T eff = 7250, log g = 4.25), IRAC/ch4 power−2 (1) power−2 (2) input apparent noise = 100 ppm ∆t = 8.4 s b = 0.5 Figure B4. MCMC sampled posterior distributions of the transit depth for the inclined transit (b = 0.5) in front of the F0 V model, IRAC/ch4 passband, with 100 ppm gaussian noise; fitting p, aR, i, normalization factor, and power-2 limb-darkening coefficients. The histogram channels (blue and light-blue) are for two chains with 1 500 000 iterations; the channels are half-thick and shifted to improve their visualization. The input (black, continuous, vertical line) and expected (black, dashed, vertical line) transit depths are also indicated. Posterior density star M5V (T eff = 3084, log g = 5.25), STIS/G430L power−2 (1) power−2 (2) input expected noise = 100 ppm ∆t = 8.4 s b = 0.5 Figure B6. MCMC sampled posterior distributions of the transit depth for the inclined transit (b = 0.5) in front of the M5 V model, STIS/G430L passband, with 100 ppm gaussian noise; fitting p, aR, i, normalization factor and power-2 limb-darkening coefficients. The histogram channels (blue and light-blue) are for two chains with 1 500 000 iterations, using the power-2 law; the channels are half-thick and shifted to improve their visualization. The input (black, continuous, vertical line) and expected (black, dashed, vertical line) transit depths are also indicated.

C. ACCURACY AND PRECISION OF EMPIRICAL LIMB-DARKENING MODELS
In contrast to other transit parameters, the limb-darkening coefficients do not correspond directly to some physical property of the star-planet system. Also, for a given limb-darkening law, there exist sets of coefficients which are largely different but generate almost indistinguishable intensity profiles. Instead of studying their posterior distributions, it is more sensitive to calculate the chains of specific intensities at given µ values, then to compare, in the case of simulations, with the input limb-darkening profile. Figure C8 shows the residuals in specific intensities obtained from the two light-curves relative to the F0 V model in the IRAC/ch4 passband (Section 5.1), one edge-on (b = 0) and one inclined (b = 0.5) transit, using the power-2 law. The error bars (i.e., the standard deviations of the intensity chains) are smaller than 0.2% for µ > 0.4, then increase up to 1% near the edge of the disk. For the inclined transit, the error bars are larger by factors 1.0-2.4. The error bars of the predicted intensities along the steep drop-off, i.e. at µ 0.08, are not representative of the true errors, as the predictions may deviate from the input values by more than 10 σ.
We find that a good set of limb-darkening coefficients, which reproduces intensities close to those predicted by the intensity chains, can be obtained by taking medians of the coefficients chains. Figure C9 shows the intensity profiles estimated in this way, from light-curves with different noise realizations. They show, on average, the same bias obtained for the noiseless case (see Section 4.4). Figures C10-C11 report the analogous results obtained for the M5 V model in the STIS/G430L passband (Section 5.2). The error bars on the specific intensities are, in average, 1.5 times larger than those obtained for the less limb darkened cases. Even in this case, the bias is similar to the one obtained for the noiseless case (see Section 4.4).  µ Intensity mean chain coefficients − sph., power−2, ideal light−curve mean chain coefficients − sph., power−2, w. noise Figure C11. Residuals between power-2 empirical limb-darkening profile and the spherical intensities for the M5 V model, STIS/G430L passband, obtained from the edge-on transit without noise (blue, continuous line) and with 100 ppm gaussian noise (different noise time series, blue, dotted lines). Models are estimated by taking the mean chain values for the limbdarkening coefficients.