Stellar convective penetration: parameterized theory and dynamical simulations

Most stars host convection zones in which heat is transported directly by fluid motion, but the behavior of convective boundaries is not well understood. Here we present 3D numerical simulations which exhibit penetration zones: regions where the entire luminosity \emph{could} be carried by radiation, but where the temperature gradient is approximately adiabatic and convection is present. To parameterize this effect, we define the"penetration parameter"$\mathcal{P}$ which compares how far the radiative gradient deviates from the adiabatic gradient on either side of the Schwarzschild convective boundary. Following Roxburgh (1989) and Zahn (1991), we construct an energy-based theoretical model in which $\mathcal{P}$ controls the extent of penetration. We test this theory using 3D numerical simulations which employ a simplified Boussinesq model of stellar convection. The convection is driven by internal heating and we use a height-dependent radiative conductivity; this allows us to separately specify $\mathcal{P}$ and the stiffness $\mathcal{S}$ of the radiative-convective boundary. We find significant convective penetration in all simulations. Our simple theory describes the simulations well. Penetration zones can take thousands of overturn times to develop, so long simulations or accelerated evolutionary techniques are required. In stars, we expect $\mathcal{P} \approx 1$ and in this regime our results suggest that convection zones may extend beyond the Schwarzschild boundary by up to $\sim$20-30% of a mixing length. We present a MESA stellar model of the Sun which employs our parameterization of convective penetration as a proof of concept. We discuss prospects for extending these results to more realistic stellar contexts.

Convection is a crucial mechanism for transporting heat in stars (Woosley et al. 2002;Hansen et al. 2004;Christensen-Dalsgaard 2021), and convective dynamics influence many poorly-understood stellar phenomena. For example, convection drives the magnetic dynamo of the Sun, leading to a whole host of emergent phenomena collectively known as solar activity (Brun & Browning 2017). Convection also mixes chemical elements in stars, Corresponding author: Evan H. Anders evan.anders@northwestern.edu which can modify observed surface abundances or inject additional fuel into their cores, thereby extending stellar lifetimes (Salaris & Cassisi 2017). Furthermore, convective motions excite waves, which can be observed and used to constrain the thermodynamic structure of stars (Aerts et al. 2010;Basu 2016). A complete and nuanced understanding of convection is therefore crucial for understanding stellar structure and evolution, and for connecting this understand to observations. Despite decades of study, robust parameterizations for the mechanisms broadly referred to as "convective overshoot" remain elusive, and improved parameterizations could resolve many discrepancies between observations and structure models. In the stellar structure literature, "convective overshoot" refers to any convectively-driven mixing which occurs beyond the boundaries of the Ledoux-unstable zone. This mixing can influence, for example, observed surface lithium abundances in the Sun and solar-type stars, which align poorly with theoretical predictions (Pinsonneault 1997;Carlos et al. 2019;Dumont et al. 2021). Furthermore, modern spectroscopic observations suggest a lower solar metallicity than previously thought, and models computed with modern metallicity estimates and opacity tables have shallower convection zones than helioseismic observations suggest (Basu & Antia 2004;Bahcall et al. 2005;Bergemann & Serenelli 2014;Vinyoles et al. 2017;Asplund et al. 2021); modeling and observational discrepancies can be reduced with additional mixing below the convective boundary (Christensen-Dalsgaard et al. 2011).
Beyond the Sun, overshooting in massive stars with convective cores must be finely tuned as a function of stellar mass, again pointing to missing physics in our current parameterizations (Claret & Torres 2018;Jermyn et al. 2018;Viani & Basu 2020;Martinet et al. 2021;Pedersen et al. 2021). Since core convective overshoot increases the reservoir of fuel available for nuclear fusion at each stage in stellar evolution, improved models of core convective boundary mixing could have profound impacts on the post-main sequence evolution and remnant formation of massive stars (Farmer et al. 2019;Higgins & Vink 2020).
In order to ensure that models can be evolved on fast (human) timescales, 1D stellar evolution codes rely on simple parameterizations of convection (e.g., mixing length theory, Böhm-Vitense 1958) and convective overshoot (Shaviv & Salpeter 1973;Maeder 1975;Herwig 2000;Paxton et al. 2011Paxton et al. , 2013Paxton et al. , 2018Paxton et al. , 2019. While some preliminary work has been done to couple 3D dynamical convective simulations with 1D stellar evolution codes (Jørgensen & Weiss 2019), these calculations are prohibitively expensive to perform at every timestep in a stellar evolution simulation. To resolve discrepancies between stellar evolution models and observations, a more complete and parameterizeable understanding of convective overshoot is required.
The broad category of "convective overshoot" in the stellar literature is an umbrella term for a few hydrodynamical processes (Zahn 1991;Brummell et al. 2002;Korre et al. 2019). Motions which extend beyond the convective boundary but do not adjust the thermodynamic profiles belong to a process called "convective overshoot" in the fluid dynamics literature. Convection zones can expand through a second process called "entrainment," through which motions erode composition gradients or modify the radiative gradient (Meakin & Arnett 2007;Viallet et al. 2013;Cristini et al. 2017;Jones et al. 2017;Fuentes & Cumming 2020;Horst et al. 2021). The primary focus of this work is a third process called "convective penetration". Convective penetration occurs when motions mix the entropy gradient towards the adiabatic in a region that is stable by the Schwarzschild criterion.
Convective overshoot, entrainment, and penetration have been studied in the laboratory and through numerical simulations for decades, and the state of the field has been regularly reviewed (e.g., Marcus et al. 1983;Zahn 1991;Browning et al. 2004;Rogers et al. 2006;Viallet et al. 2015;Korre et al. 2019). Experiments exhibiting extensive expansion of convection zones via entrainment have a long history (e.g., Musman 1968;Deardorff et al. 1969;Moore & Weiss 1973, and this process is often confusingly called "penetration"). Modern numerical experiments often examine the importance of the "stiffness" S of a radiative-convective interface. S compares the relative stability of a radiative zone and an adjacent convection zone according to some measure like a dynamical frequency or characteristic entropy gradient. Some recent studies in simplified Boussinesq setups exhibit stiffness-dependent convection zone expansion via entrainment (Couston et al. 2017;Toppaladoddi & Wettlaufer 2018); others find stiffness-dependent pure overshoot (Korre et al. 2019). A link between S and the processes of entrainment and overshoot has seemingly emerged, but a mechanism for penetration remains elusive.
Many studies in both Cartesian and spherical geometries have exhibited hints of penetrative convection. Some authors report clear mixing of the entropy gradient beyond the nominal convecting region (Hurlburt et al. 1994;Saikia et al. 2000;Brummell et al. 2002;Rogers & Glatzmaier 2005;Rogers et al. 2006;Kitiashvili et al. 2016), but it is often unclear how much mixing is due to changes in the location of the Schwarzschild boundary (entrainment) and how much is pure penetration. Other authors present simulations with dynamical or flux-based hints of penetration such as a negative convective flux or a radiative flux which exceeds the total system flux, but do not clearly report the value of the entropy gradient (Hurlburt et al. 1986;Singh et al. 1995;Browning et al. 2004;Pratt et al. 2017). Still other simulations show negligible penetration (e.g., Cai 2020; Higl et al. 2021). Even detailed studies which sought a relationship between penetration depth and stiffness S have presented contradictory results. Early work by e.g., Hurlburt et al. (1994) and Singh et al. (1995) hinted at a link between S and penetration length, at least for low values of S. Subsequent simulations by Brummell et al. (2002)  weak scaling of penetration depth with S; the authors interpret this scaling as a sign of pure overshoot and claim their simulations do not achieve adiabatic convective penetration. Still later simulations by Rogers & Glatzmaier (2005) demonstrate a negligible scaling of the penetration depth against S at moderate values of S. Prior simulations thus consistently show hints of penetration at low S (where results may not be relevant for stars, Couston et al. 2017), but present confusing and contradictory results at moderate-to-high S. There are hints in the literature that convective penetration may depend on energy fluxes. Roxburgh (1978Roxburgh ( , 1989Roxburgh ( , 1992Roxburgh ( , 1998 derived an "integral constraint" from the energy equation and found that a spatial integral of the flux puts an upper limit on the size of a theoretical penetrative region. Zahn (1991) theorized that convective penetration should depend only on how steeply the radiative temperature gradient varies at the convective boundary. Following Zahn (1991)'s work, Rempel (2004) derived a semianalytic model and suggested that inconsistencies seen in simulations of penetrative dynamics can be explained by the magnitude of the fluxes or luminosities driving the simulations. Indeed, some simulations have tested this idea, and found that penetration lengths depend strongly on the input flux (Singh et al. 1998;Käpylä et al. 2007;Tian et al. 2009;Hotta 2017;Käpylä 2019). Furthermore, in the limit of low stiffness, the simulations of Hurlburt et al. (1994) and Rogers & Glatzmaier (2005) may agree with Zahn's theory (although at high stiffness they disagree). In light of these results, and the possible importance of energy fluxes, Roxburgh's integral constraint and Zahn's theory deserve to be revisited.

Convective penetration & this study's findings
Convective penetration is the process by which convective motions extend beyond the Schwarzschild-stable boundary and mix the entropy gradient to be nearly adiabatic.
In this paper, we present simulations which exhibit convective penetration. This process is phenomenologically described in Sec. 2. In this work, the convection zone lies beneath an adjacent stable layer and convection penetrates upwards; our results equally apply to the reversed problem.
In order to understand this phenomenon, we derive theoretical predictions for the size of the penetrative zone based on the ideas of Roxburgh (1989) and Zahn (1991).
We find that the extent of convective penetration depends strongly on the shape and magnitude of the radiative gradient near the convective boundary.
Thus, the penetration length can be calculated using the radiative conductivity (or opacity) profile near the convective boundary. We present simulations of internally heated convection in which both the Schwarzschild boundary location and the extent of convective penetration depend primarily on the depth-dependent radiative conductivity.
We present these findings as follows. In Sec. 2, we present the central finding of this work: penetration zones in nonlinear convective simulations. In Sec. 3, we describe the equations used and derive a parameterized theory of convective penetration. In Sec. 4, we describe our simulation setup and parameters. In Sec. 5, we present the results of these simulations, with a particular focus on the height of the penetrative regions. In Sec. 6, we create and discuss a stellar model in MESA which has convective penetration. Finally, we discuss pathways for future work in Sec. 7.

CENTRAL RESULT: CONVECTIVE PENETRATION
In Fig. 1, we display a snapshot of dynamics in an evolved simulation which exhibits convective penetration. The simulation domain is a 3D Cartesian box, and this figure shows a vertical slice through the center of the domain. In the left panel, we display the vertical velocity. We see that convective motions extend beyond the Schwarzschild boundary of the convection zone, which is denoted by a horizontal dashed grey line. These motions stop at the top of a penetration zone, denoted by a solid horizontal line, where the temperature gradient departs from adiabatic towards the radiative gradient. In the right panel, we display temperature perturbations away from the time-evolving mean temperature profile. We see that warm upwellings in the Schwarzschild-unstable convection zone (below the dashed line) become cold upwellings in the penetration zone (above the dashed line), and these motions excite gravity waves in the stable radiative zone (above the solid line).
We further explore the simulation from Fig. 1 in Fig. 2 by displaying time-and horizontally-averaged 1D profiles of the temperature gradient ∇ (defined in Sec. 3). The adiabatic gradient ∇ ad (purple) has a constant value in the simulation. Also shown is the radiative gradient ∇ rad (orange). The domain exhibits a classical Schwarzshild-unstable convection zone (CZ) for z 1.04 where ∇ rad > ∇ ad ; the upper boundary of this region is denoted by a dashed vertical line. Above this point, ∇ rad < ∇ ad and the domain would be considered stable by the Schwarzschild criterion. However, the evolved convective dynamics in Fig. 1 have raised ∇ → ∇ ad in an extended penetration zone (PZ) which extends from 1.04 z 1.3. Above z 1.4, ∇ ≈ ∇ rad in a classical stable radiative zone (RZ). Between 1.3 z 1.4, there is a PZ-RZ boundary layer (referred to as the "thermal adjustment layer" in some prior studies) where convective motions give way to conductive transport and ∇ adjusts from ∇ ad to ∇ rad .
Our goals in this paper are to understand how these PZs form and to parameterize this effect so that it can be included in 1D stellar evolution calculations.
Horizontally-and temporally-averaged profiles of the thermodynamic gradients from the simulation in Fig. 1. We plot ∇ (green) compared to ∇ ad (purple, a constant) and ∇ rad (orange); note the extended penetration zone (PZ) where ∇ ≈ ∇ ad > ∇ rad . The dashed vertical line denotes the Schwarzschild boundary of the convection zone (CZ), the solid vertical line denotes the bottom of the radiative zone (RZ), and the greyed region denotes the PZ-RZ boundary layer.
In this section we derive a theoretical model of convective penetration by examining the energetics and energy fluxes in the Schwarzschild-unstable convection zone (CZ) and penetration zone (PZ). In Sec. 3.1, we describe our equations and problem setup and define the heat fluxes. In Sec. 3.2, we build a parameterized theory based on the kinetic energy (KE) equation. We find that imbalances in KE source terms within the CZ determine the extent of the PZ. By balancing the excess KE generation in the CZ with buoyant deceleration and dissipation work terms in the PZ, we are able to derive the size of the PZ. We find that a description of the size of a theoretical PZ does not depend on the often-considered stiffness, which measures the relative stability between the convection zone and an adjacent radiative zone.

Equations & flux definitions
Throughout this work, we will utilize a modified version of the incompressible Boussinesq equations, Here, the density is decomposed into a uniform, constant background ρ 0 with fluctuations ρ 1 which appear only in the buoyancy force and depend on the temperature T and the coefficient of thermal expansion α = ∂ ln ρ/∂T . We define the velocity vector u, the pressure p, the viscous diffusivity ν, the thermal diffusivity χ, the bulk internal heating Q, the adiabatic gradient ∇ ad , and a height-dependent thermal conductivity 1 k. We will consider Cartesian coordinates (x, y, z) with a constant vertical gravity g = −gẑ. Throughout this work, we will represent horizontal averages with bars ( · ) and fluctuations away from those averages with primes ( ). Thus, in Eqn. 3, T is the horizontally averaged temperature and T are fluctuations away from that; both of these fields evolve in time according to Eqn. 3. Assuming convection reaches a time-stationary state, the heat fluxes are found by horizontally-averaging then vertically integrating Eqn. 3 to find where F bot is the flux carried at the bottom of the domain, and F tot is the total flux, which can vary in height due to the heating Q. The mean temperature profile T carries the radiative flux F rad = −k∇T . We note that k and −∂ z T fully specify F rad and in turn the convective flux, F conv = F tot − F rad . We define the temperature gradient and radiative temperature gradient We have defined the ∇'s as positive quantities to align with stellar structure conventions and intuition. Marginal stability is achieved when ∇ = ∇ ad , which we take to be a constant. We note that the classical Schwarzschild boundary of the convection zone is the height z = L s at which ∇ rad = ∇ ad and F conv = 0. The addition of a nonzero ∇ ad to Eqn. 3 was derived by Spiegel & Veronis (1960) and utilized by e.g., Korre et al. (2019). In this work, we have decomposed the radiative diffusivity into a background portion (∇·F rad ) and a fluctuating portion (χ∇ 2 T ); by doing so, we have introduced a height-dependent ∇ rad to the equation set while preserving the diffusive behavior on fluctuations 1 In a star, χ ≡ k. We separate these values out of practicality, because simulations are well-resolved and numerically stable when k χ. The maximum vertical wavenumber of the T expansion is set by the stiffness (see Eqn. 27), not the radiative diffusivity k, so k can be small. On the other hand, an expansion of the turbulent fluctuations T must include the cutoff wavenumber of the turbulent cascade, which is set by χ. We separate k and χ in order to explore simulations with a wider range of penetrative behaviors (per Eqn. 15), as the theory presented here depends only on k. Note that as we increase the turbulence (the Reynolds number) in our simulations, we decrease χ, and χ → k. felt by classical Rayleigh-Bénard convection. Here, we will assume a model in which an unstable convection zone (∇ rad > ∇ ad ) sits below a stable radiative zone (∇ rad < ∇ ad ), but in this incompressible model where there is no density stratification to break the symmetry of upflows and downflows, precisely the same arguments can be applied to the inverted problem.

Kinetic energy & the dissipation-flux link
Taking a dot product of the velocity and Eqn. 2 reveals the kinetic energy equation, where we define the kinetic energy K ≡ |u| 2 /2, the fluxes of kinetic energy F ≡ [u(K + p/ρ 0 ) − νu × ω], the buoyant energy generation rate B ≡ |α|gwT , and the viscous dissipation rate Φ ≡ ν|ω| 2 where ω = ∇×u is the vorticity and |u| 2 = u · u & |ω| 2 = ω · ω. We next take a horizontal-and time-average of Eqn. 7 (we absorb the time-average into the horizontal-average · notation for simplicity). Assuming that K reaches a statistically stationary state, convective motions satisfy where F z is the z-component of F . Each profile in Eqn. 8 is shown in Fig. 3 for the simulation whose dynamics are displayed in Fig. 1. As in Fig. 2, the Schwarzschild CZ boundary is plotted as a dashed line, and the top of the PZ is plotted as a solid vertical line.
In the top panel, we display F z , neglecting the viscous flux term which is only nonzero in a small region above the bottom boundary. We see that F z is zero at the bottom boundary (left edge of plot) and at the top of the PZ. In the bottom panel, we plot B and Φ; we see that B changes sign at the Schwarzschild CZ boundary, and that Φ is positive-definite. At the boundaries of the convecting region, F z is zero (Fig. 3, upper panel). We integrate Eqn. 8 vertically between these zeros to find Integral constraints of this form are the basis for a broad range of analyses in Boussinesq convection (see e.g., Ahlers et al. 2009;Goluskin 2016) and were considered in the context of penetrative stellar convection by Roxburgh (1989). Eqn. 9 is the straightforward statement that work by buoyancy on large scales must be balanced by viscous dissipation on small scales. We break up the convecting region into a Schwarzschild-unstable "convection zone" (CZ) and an extended "penetration zone" (PZ); we assume that convective motions efficiently mix ∇ → ∇ ad in both the CZ and PZ. The buoyant energy generation is proportional to the convective flux, B = |α|gwT = |α|gF conv , and is positive in the CZ and negative in the PZ (see Fig. 3, bottom panel). Breaking up Eqn. 9, we see that Eqn. 10 is arranged so that the (positive) buoyant engine of convection is on the left-hand side, and the (positive) sinks of work are on the RHS. If viscous dissipation in the CZ does not balance the buoyant generation of energy in the CZ, the kinetic energy of the convective flows grows, resulting in a penetrative region. This region grows with time until Eqn. 10 is satisfied. We see that the viscous dissipation and buoyant deceleration felt by flows in the PZ determine its size. We now define the measurable fraction of the buoyant engine consumed by CZ dissipation. Eqn. 10 can then be rewritten as We will measure and report the values of f achieved in our simulations in this work. Eqn. 12 provides two limits on a hypothetical PZ: 1. In the limit that f → 0, viscous dissipation is inefficient. Reasonably if we also assume that PZ Φ dz → 0, Eqn. 12 states that the PZ must be so large that its negative buoyant work is equal in magnitude to the positive buoyant work of the CZ. This is the integral constraint on the maximum size of the PZ that Roxburgh (1989) derived.
2. In the limit that f → 1, viscous dissipation efficiently counteracts the buoyancy work in the CZ. Per Eqn. 12, the positive-definite PZ terms must approach zero and no PZ develops in this limit. This is mathematically equivalent to standard boundary-driven convection experiments.
In general, we anticipate from the results of e.g., Currie & Browning (2017) that f is closer to 1 than 0, but its precise value must be measured from simulations. Indeed, we find that f 0 but f < 1 in our simulations (see e.g., Fig. 3, bottom panel 2 ). Our simulations produce typical values of f ∼ 0.7.
Assuming that a PZ of height δ p develops above a CZ of depth L CZ , we model the PZ dissipation as Here Φ CZ is the volume-averaged dissipation rate in the CZ and ξ is a measurable parameter in [0, 1] that describes the shape of the dissipation profile as a function of height in the PZ. In words, we assume that Φ(z = L s ) ≈ Φ CZ at the CZ-PZ boundary and that Φ decreases with height in the PZ. The shape of Φ determines ξ; a linear falloff gives ξ = 1/2, a quadratic falloff gives ξ = 2/3, and ξ = 1 assumes no falloff. With this parameterization, and B ∝ F conv , we rewrite Eqn. 12, The fundamental result of this theory is Eqn. 14, which is a parameterized and generalized form of Roxburgh (1989)'s integral constraint. This equation is also reminiscent of Zahn (1991)'s theory, and says that the size of a PZ is set by the profile of ∇ rad near the convective boundary. A parameterization like Eqn. 14 can be implemented in stellar structure codes and used to find the extent of penetration zones under the specification of f and ξ. We note that an implementation of Eqn. 14 likely requires an iterative solve, as the penetration zone depth (δ p ) and thus the PZ integral of the flux, are not known a-priori. The parameters f and ξ are measurables which can be constrained by direct numerical simulations, and we will measure their values in this work.
In general, we expect that f and ξ should not change too drastically with other simulation parameters.
In order to derive a specific prediction for the PZ height, one must specify the vertical shape of F conv . We will study two cases in this work, laid out below. In both of these cases, we define a nondimensional "Penetration Parameter" whose magnitude is set by the ratio of the convective flux slightly above and below the Schwarzschild convective boundary L s (assuming ∇ = ∇ ad in the CZ and PZ), Since F conv < 0 in the PZ, the sign of P is positive. Intuitively, P describes which terms are important in Eqn. 12. When P 1, the buoyancy term dominates in the PZ and dissipation can be neglected there. When P 1, buoyancy is negligible and dissipation constrains the size of the PZ. When P ∼ 1, both terms matter. In this work, we have assumed that P and ξ are fully independent parameters. We make this choice because P can be determined directly from a known conductivity profile or stratification, whereas ξ is a measurable of evolved nonlinear convective dynamics. However, it is possible that there is an implicit relationship between these parameters (as P increases, so too does the extent of the PZ, which likely in turn modifies the value of ξ).

Case I: Discontinuous flux
We first consider a model which satisfies Here, F cz is a constant value of flux carried in the convection zone and P D is the penetration parameter (subscript D for discontinuous case). Plugging this functional form of the flux into Eqn. 14, and integrating the CZ over a depth L CZ below L s and the PZ over a height δ p above L s , we predict Assuming that f and ξ are weak functions of P D , we see that, for small P D , the size of the penetration region is linearly proportional to P D , but saturates as P D → ∞ due to dissipation. Intuitively, this result makes sense: as P D grows, the magnitude of F conv and the deceleration caused by buoyancy in the PZ shrink, resulting in larger penetrative regions (but this growth cannot extend indefinitely).

Case II: Piecewise linear flux
We next examine a model where the derivative of F conv (z) may be discontinuous at the CZ-PZ boundary, where (∂F rad /∂z)| CZ is a constant and P L is the penetration parameter (subscript L for linear case). When P L = 1, F conv is a linear profile that crosses through zero at z = L s . Solving Eqn. 14 with Eqn. 18 and integrating over L CZ in the CZ and δ p in the PZ, we retrieve a quadratic equation. This equation has two solution branches, only one of which corresponds to a positive value of δ p . On that branch, we find where ζ ≡ (ξf /2) P L /(1 − f ). We expect the penetration height to be proportional to √ P L for small values of P L , and to again saturate at large values of P L (as P L → ∞, so too ζ → ∞, and ( ζ 2 + 1 − ζ) → 0). In this work, we will test Eqn. 14 through the predictions of Eqns. 17 and 19. Our goals are to see if the predicted scalings with the penetration parameter P are realized in simulations, and to measure the values of f and ξ.

SIMULATION DETAILS
We will now describe a set of simulations that test the predictions in Sec. 3. While many simulations of convection interacting with radiative zones have been performed by previous authors, ours differ in two crucial ways. First, we construct our experiments so that P and S can be varied separately by driving convection with internal heating, thus avoiding strongly superadiabatic boundary layers where ∇ → ∇ rad . P is the "Penetration Parameter," defined in Eqn. 15, which compares the magnitude of the convective flux in the CZ and PZ; S is the "stiffness," defined in Eqn. 27, and compares the buoyancy frequency in the stable radiative zone to the convective frequency. We suspect that some past experiments have implicitly set P ≈ S −1 , which would result in negligible penetration for high stiffness (see discussion following Eqn. 27). Second, as we will show in Sec. 5, the development of penetrative zones is a slow process and many prior studies did not evolve simulations for long enough to see these regions grow and saturate.
Appealing to the Buckingham π theorem (Buckingham 1914), we count nine fundamental input parameters in Eqns. 1-4: ρ 0 , αg, L s , ν, χ, Q, ∇ ad , k CZ , and k RZ . There are four fundamental dimensions (mass, length, time, and temperature), and so we are left with five independent prognostic parameters in setting up our system. For two of these parameters, we will choose the freefall Reynolds number and the Prandtl number, which are analagous to the Rayleigh and Prandtl numbers in Rayleigh-Bénard convection. The remaining three parameters are S, P, and an additional parameter µ, which we will hold constant and which sets the ratio between ∇ rad and ∇ ad in the convection zone.
We nondimensionalize Eqns. 1-4 on the length scale of the Schwarzschild-unstable convection zone L s , the timescale of freefall across that convection zone and the temperature scale of the internal heating over that freefall time ∆T ; mass is nondimensionalized so that the freefall ram pressure ρ 0 (L s /τ ff ) 2 = 1, For convenience, here we define quantities with * (e.g., T * ) as being the "dimensionful" quantities of Eqns. 1-4. Henceforth, quantities without * (e.g., T ) are dimensionless. The dimensionless equations of motion are We construct a domain in the range z ∈ [0, L z ] and choose L z ≥ 2 so that the domain is at least twice as deep as the Schwarzschild-unstable convection zone. We decompose the temperature field into a timestationary initial background profile and fluctuations, T (x, y, z, t) = T 0 (z) + T 1 (x, y, z, t). T 0 is constructed with ∇ = ∇ ad for z ≤ L s , and ∇ = ∇ rad above z > L s . We impose a fixed-flux boundary at the bottom of the box (∂ z T 1 = 0 at z = 0) and a fixed temperature boundary at the top of the domain (T 1 = 0 at z = L z ). We generally impose impenetrable, no-slip boundary conditions at the top and bottom of the box so that u = 0 at z = [0, L z ]. For a select few simulations, we impose stress-free instead of no-slip boundary conditions (w = 0 We impose a constant internal heating which spans only part of the convection zone, The integrated flux through the system from heating is Throughout this work we choose Q mag = 1 and ∆ H = 0.2 so F H = 0.2. We offset this heating from the bottom boundary to z = 0.1 to avoid heating within the bottom impenetrable boundary layer where velocities go to zero and k is small; this prevents strong temperature gradients from establishing there. Furthermore, since the conductivity is not zero at the bottom boundary, the adiabatic temperature gradient there carries some flux. We specify the flux using and we choose µ = 10 −3 so that most of the flux in the convection zone is carried by the convection. Throughout this paper, we assume that the convection zone is roughly adiabatically stratified. We therefore define a dynamical measure of the stiffness, rather than one based on e.g., the superadiabaticity of ∇ rad in the convection zone. The average convective velocity depends on the magnitude of the convective flux, |u| ≈ F 1/3 H = (Q mag ∆ H ) 1/3 . The characteristic convective frequency is f conv = |u| /L s . Empirically we find that for our choice of parameters, |u| ≈ 1, so going forward we define f conv = 1. The stiffness is defined, where N 2 is the Brunt-Väisälä frequency in the radiative zone. In our nondimensionalization, N 2 = ∇ ad − ∇ rad in the radiative zone. We use S as a control parameter. In many prior studies, the stiffness has been set by the ratio of the subadiabaticity of ∇ rad in the RZ to the superadiabaticity of ∇ rad in the CZ, In those studies,S primarily describes the stratification of the initial state, but it also describes the stratifica-tion in superadiabatic boundary layers which drive convection. In this work, we maintain a nearly adiabatic convection zone without strongly superadiabatic regions by driving convection with an internal heating function which is offset from the lower boundary. Previous work has not defined P, but its definition in our current study should apply to previous studies, We note that P can be related to S andS, Our use of internal heating to decouple convective perturbations from ∇ rad in the CZ allows us to separately specify these nondimensional parameters. The distinction between S and P is perhaps clearer in the language of stellar evolution, where S is roughly the inverse square Mach number of the convection while P is set by the ratio of ∇ rad and ∇ ad . Aside from S, P, and µ, the two remaining control parameters R and Pr determine the properties of the turbulence. The value of R corresponds to the value of the Reynolds number Re = R|u|, and we will vary R. Astrophysical convection exists in the limit of Pr 1 (Garaud 2021); in this work we choose a modest value of Pr = 0.5 which slightly separates the thermal and viscous scales while still allowing us to achieve convection with large Reynolds and Péclet numbers.
We now describe the two types of simulations conducted in this work (Case I and Case II). We provide Fig. 4 to visualize the portion of the parameter space that we have studied. We denote two "landmark cases" using a purple box (Case I landmark) and an orange box (Case II landmark). These landmark cases will be mentioned throughout this work.

Case I: Discontinuous flux
Most of the simulations in this paper have a discontinuous convective flux at the Schwarzschild convective boundary. We achieve this by constructing a discontinuous radiative conductivity, where CZ refers to the convection zone and RZ refers to the radiative zone (some of which will be occupied by the penetrative zone PZ). Using S and P D as inputs and specifying the radiative flux at the bottom boundary and in the RZ defines this system, Eqns. 31 are found by solving the system of equations We study a sweep through each of the (P D , S, R) parameter spaces while holding all other parameters constant (see Fig. 4). We study an additional sweep through R parameter space using stress-free boundaries to compare to our no-slip cases. According to Eqn. 17, we expect δ p ∝ P D .

Case II: Piecewise linear flux
We also study simulations where the flux's gradient may be discontinuous at the Schwarzschild convective boundary. We achieve this by constructing a radiative conductivity with a piecewise discontinuous gradient, Since k varies with height, formally the values of S and P also vary with height; we specify their values at z = 2. By this choice, we require where ψ ≡ 1 + P L (1 + µ). We will study one sweep through P L space at fixed R and S (see Fig. 4). According to Eqn. 19, we expect δ p ∝ P 1/2 L . We arrive at Eqns. 33 by solving the system of equa-

Numerics
We time-evolve equations 22-24 using the Dedalus pseudospectral solver (Burns et al. 2020) 3 using timestepper SBDF2 (Wang & Ruuth 2008) and safety factor 0.35. All fields are represented as spectral expansions of n z Chebyshev coefficients in the vertical (z) direction and as (n x ,n y ) Fourier coefficients in the horizontal (x,y) directions; our domains are therefore horizontally periodic. We use a domain aspect ratio of two so that x ∈ [0, L x ] and y ∈ [0, L y ] with L x = L y = 2L z . To avoid aliasing errors, we use the 3/2-dealiasing rule in all directions. To start our simulations, we add random noise temperature perturbations with a magnitude of 10 −3 to a background temperature profile T ; we discuss the choice of T in appendix A. In some simulations we start with T = T 0 , described above, and in others we impose an established penetrative zone in the initial state T according to Eqn. A1.
Spectral methods with finite coefficient expansions cannot capture true discontinuities. In order to approximate discontinuous functions such as Eqns. 25, 30, and 32, we must use smooth transitions. We therefore define a smooth Heaviside step function, where erf is the error function. In the limit that d w → 0, this function behaves identically to the classical Heaviside function centered at z 0 . For Eqn. 25 and Eqn. 32, we use d w = 0.02; while for Eqn. 30 we use d w = 0.075. In all other cases, we use d w = 0.05. A table describing all of the simulations presented in this work can be found in Appendix C. We produce the figures in this paper using matplotlib (Hunter 2007; 3 we use commit efb13bd; the closest stable release to this commit is v2.2006. Caswell et al. 2021). All of the Python scripts used to run the simulations in this paper and to create the figures in this paper are publicly available in a git repository 4 , and in a Zenodo repository (Anders et al. 2021).

Penetration height measurements
In our evolved simulations, the penetrative region has a nearly adiabatic stratification ∇ ≈ ∇ ad . To characterize the height of the penetrative region, we measure how drastically ∇ has departed from ∇ ad . We define the difference between the adiabatic and radiative gradient, We measure penetration heights in terms of "departure points," or heights at which the realized temperature gradient ∇ has evolved away from the adiabatic ∇ ad by some fraction h < 1 of ∆. Specifically, In this work, we measure the 10% (δ 0.1 , h = 0.1), 50% (δ 0.5 , h = 0.5), and 90% (δ 0.9 , h = 0.9) departure points. Using Zahn (1991)'s terminology, δ 0.5 is the mean value of the top of the PZ while δ 0.9 − δ 0.1 represents the width of the PZ-RZ boundary layer. We find that these measurements based on the (slowly-evolving) thermodynamic profile provide a robust and straightforward measurement of penetration height (for a discussion of alternate measurement choices, see Pratt et al. 2017).

RESULTS
We now describe the results of the 3D dynamical simulations described in the previous section. Fig. 1 displays the dynamics in one of these simulations. While we will briefly examine dynamics here, our primary goal in this section is to quantitatively compare our simulations to the theory of Sec. 3 using temporally averaged measures.

Dynamics
In Fig. 5 we display snapshots of the temperature anomalies in the two "landmark" simulations denoted by boxes in Fig. 4. We display the temperature anomaly in the top panel of the Case I simulation with R = 400, P D = 4, and S = 10 3 ; this simulation is included in all three of our parameter space sweeps and represents the point where our (R, P, S) cuts converge in Fig. 4. We display the temperature anomaly in the bottom panel of the Case II simulation with R = 800, P L = 4, and S = 10 3 . The bulk Reynolds number in the convection zones of these simulations are (top) Re ∼ 250 and (bottom) Re ∼ 350. Thus, these simulations are less turbulent than the simulation in Fig. 1 (bulk Re ∼ 5000). Aside from the degree of turbulence, the dynamics are very similar in Figs. 1 & 5. In particular, we observe that relatively hot plumes in the CZ turn into relatively cold plumes in the PZ (as they cross the dashed horizontal lines), and relatively hot regions in the PZ lie above relatively cold regions in the CZ. Convective plumes extend through the penetrative region and impact the stable radiative zone (above the solid horizontal line). The convective motions excite waves at a shallow angle above the stiff radiative-convective boundary. We note that the Case II simulation has an additional temperature inversion at the base of the simulation. Case II simu-lations have a linearly increasing conductivity k in the convection zone, so there is formally a small penetrative region where ∇ ≈ ∇ ad > ∇ rad at the base of the domain below the internal heating layer (lower dotted line in bottom panel of Fig. 5). While the landmark simulations in Fig. 5 are not as turbulent as the dynamics in Fig. 1, they are sufficiently nonlinear to be interesting. Importantly, these simulations develop large penetration zones, and can be evolved for tens of thousands of convective overturn times. As we will demonstrate in the next section, the formation timescale of penetrative zones can take tens of thousands of convective overturn times.

Qualitative description of simulation evolution
In Fig. 6, we show the time evolution of the landmark Case I simulation (R = 400, S = 10 3 , and P D = 4) whose initial temperature profile sets ∇ = ∇ ad in the convection zone (z 1) and ∇ = ∇ rad in the radiative zone (z 1). In the top left panel, we display the height of the penetrative region δ 0.5 vs. time. This region initially grows quickly over hundreds of freefall times, but this evolution slows down; reaching the final equilibrium takes tens of thousands of freefall times. The evolution of the other parameters in our theory (f , ξ) are shown in the middle and bottom left panels of Fig. 6. We plot the rolling mean, averaged over 200 freefall time units. We see that the values of f and ξ reach their final values (f ≈ 0.67, ξ ≈ 0.58) faster than the penetration zone evolves to its full height. We quantify this fast evolution by plotting vertical lines in each of the left three panels corresponding to the first time at which the rolling average converges to within 1% of its equilibrated value. The equilibrated value is averaged over the final 1000 freefall times of the simulation and plotted as a grey horizontal line. The evolved value of f indicates that roughly 2/3 of the buoyancy driving is dissipated in the bulk CZ, so that 1/3 is available for PZ dissipation and negative buoyancy work. The evolved value of ξ indicates that the shape of dissipation in the PZ is slightly steeper than linear.
In the right panel of Fig. 6, we plot the profile of ∇/∇ ad in our simulation at regular time intervals, where the color of the profile corresponds to time, as in the left panels. ∇ ad is plotted as a dashed horizontal line while ∇ rad is plotted as a grey solid line which decreases with height around z ≈ 1 and satures to a constant above z 1.1. The location of the Schwarzschild boundary, L s , is overplotted as a black vertical dashed line. We note that the Schwarzschild boundary does not move over the course of our simulation, so the extention of the convection zone past this point is true penetration and not the result of entrainment-induced changes in the Schwarzschild (or Ledoux) convective boundaries. The traces of δ 0.1 and δ 0.9 are overplotted as red lines while that of δ 0.5 is plotted as a black line. We see that the fast initial evolution establishes a sizeable PZ (denoted by purple ∇ profiles), but its final equilibration takes much longer (indicated by the separation between the purple, green, and yellow profiles decreasing over time). This long evolution is computationally expensive; for this modest simulation (256x64 2 coefficients), this evolution takes roughly 24 days on 1024 cores for a total of ∼600,000 cpu-hours. It is not feasible to perform simulations of this length for a full parameter space study, and so we accelerate the evolution of most of the simulations in this work. To do so, we take advantage of the nearly monotonic nature of the evolution of δ p vs. time displayed in Fig. 6. We measure the instantaneous values of (δ 0.1 , δ 0.5 , δ 0.9 ), as well as their instantaneous time derivatives. Using these values, we take a large "time step" forward to evolve δ p . While doing so, we preserve the width of the transition from the PZ to the RZ, and we also adjust the solution so that ∇ = ∇ rad in the RZ, effectively equilibrating the RZ instantaneously. In other words, we reinitialize the simulation's temperature profile with a better guess at its evolved state based on its current dynamical evolution. For details on how this procedure is carried out, see Appendix A.

Dependence on P
We find that the height of the penetration zone is strongly dependent on P. In the upper two panels of Fig. 7, we plot the penetration height (δ 0.1 , δ 0.5 , δ 0.9 from Eq. 36) from Case I simulations (discontinuous k, upper left) and Case II simulations (discontinuous ∂ z k, upper right). The fixed values of R and S are shown above these panels. We find that the leading-order P scaling predictions of Eqns. 17 & 19 describe the data well at intermediate values of P (orange lines). At small values of P we see somewhat weaker scalings than these predictions, because the profiles of k and ∂ z k are not truly discontinuous but jump from one value in the CZ to another in the RZ over a finite width (see e.g., the ∇ rad profile in Figs. 2 & 6 and Sec. 4.3). At large values of P, the penetration height falls off of these predicted scaling laws. In this regime, dissipation dominates over buoyancy in the PZ, so the PZ height saturates.
The middle and bottom panels of Fig. 7 demonstrate that that f and ξ are to leading order constant with P. However, we find that f has slightly smaller values in the Case I simulations (left) than in the Case II simulations (right). We measure characteristic values of f ∈ [0.6, 0.9], signifying that 60-90% of the buoyant work is balanced by dissipation in the convection zone, depending on the simulation. We note a weak trend where f decreases as P increases. As P increases, we In the middle panels, we measure f according to Eqn. 11. We find values of f ∈ [0.6, 0.9], and changes in f are secondary to changes in P for determining penetration heights. In the bottom panels, we measure ξ according to Eqn. 13. We find characteristic values of ξ ∈ [0.5, 0.75], suggesting that the falloff of the Φ in the PZ is well described by a linear function (at high P when ξ ≈ 1/2), or by a cubic function (at low P when ξ ≈ 3/4).
find that CZ velocities decrease, leading to a decrease in the dissipation rate. When P is small, the PZ-RZ boundary (which acts like a wall, left panel of Fig. 1) efficiently deflects convective velocities sideways resulting in increased bulk-CZ velocities. As P grows, the velocities have access to an extended PZ in which to buoyantly decelerate before deflection, resulting in slightly lower bulk velocities. A similar trend of ξ decreasing as P increases can be seen. Recall that smaller values of ξ indicate the dissipative dynamics are rather different in the PZ and CZ. As the size of the PZ grows, the dynamical structures of the PZ shift from what is found in the CZ, and so ξ shrinks.

Dependence on S
We find that the height of the penetration zone is weakly dependent on S. In the left panel of Fig. 8, we plot the penetration height of a few Case I simulations with P D = 4 and R = 400 but with different values of S. The mean penetration height δ 0.5 varies only weakly with changing S, but that the values of δ 0.1 and δ 0.9 vary more strongly. The PZ-RZ boundary layer in which ∇ changes from ∇ ad to ∇ rad becomes narrower as S increases. To quantify this effect, we plot δ 0.9 − δ 0.1 in the righthand panel of Fig. 8. We find that the width of this region varies roughly according to a S −1/2 scaling law, reminiscent of the pure-overshoot law described by Korre et al. (2019).
Note that if the enstrophy, ω 2 in the convection zone exceeds the value of the square buoyancy frequency N 2 in the radiative zone, the gravity waves in the RZ become nonlinear. We therefore restrict the simulations in this study to relatively large 5 values of 10 2 ≤ S < 10 4 in order to ensure N 2 > ω 2 even in our highest enstrophy simulations.

Dependence on R
We find that the height of the penetration zone is weakly dependent on R. In the upper left panel of Fig. 9, we find a logarithmic decrease in the penetration height with the Reynolds number. In order to understand how this could happen at fixed P, we also plot the output values of f (upper middle) and ξ (upper right). We find that f increases with increasing R, but is perhaps leveling off as R becomes large. We find that ξ does not increase strongly with R except for in the case of laminar simulations with R < 200. Eqn. 17 predicts that δ p should change at fixed P and ξ if f is changing. In the bottom left panel, we show that the change in δ p is due to this change in f . We find that this is true both for simulations with stress-free dynamical boundary conditions (open symbols, SF) and for no-slip conditions (closed symbols, NS).
We now examine why f increases as R increases. In the SF simulations, within the CZ, we can reasonably approximate Φ as a constant Φ CZ in the bulk and zero within the viscous boundary layer, where ν is the viscous boundary layer depth. We have visualized a NS dissipation profile in the bottom panel of Fig. 3; SF simulations look similar in the bulk, but 5 These values are large for nonlinear simulations, but modest compared to astrophysical values. While there is observational uncertainty about the magnitude of deep convective velocities in the Sun, in the MESA model presented in Sct. 6, fconv ≈ 10 −6 s −1 and N ≈ 10 −3 s −1 , so S ≈ 10 6 .
drop towards zero at the bottom boundary rather than reaching a maximum. Then, we have and so per Eqn. 11, where f ∞ is the expected value of f at R = ∞ when ν = 0. So we see that the CZ dissipation and therefore f vary linearly with ν . In the bottom middle panel of Fig. 9, we find that Eqn. 39 with f ∞ = 0.755 captures the high-R behavior. To measure ν , we first measure the height of the extremum of the viscous portion of the kinetic energy flux F near the boundary, and take ν to be the twice that height. We find that Eqn. 39 is a slightly better description for the SF simulations than the NS simulations; NS simulations have maximized dissipation in the boundary layer, and therefore Eqn. 37 is a poor model for z ≤ ν . In the bottom right panel of Fig. 9, we demonstrate that the depth of the viscous boundary layer follows classical scaling laws from Rayleigh-Bénard convection 6 (Ahlers et al. 2009;Goluskin 2016). Combining these trends, we expect left panel. We need to multiply this equation by a factor of 0.9, which accounts for some differences between the simulations and the idealized "discontinuous flux" theoretical model. First, due to internal heating and the finite width of the conductivity transition around the Schwarzschild boundary, the convective flux is not truly constant through the full depth of the CZ. Thus, we expect L CZ in Eqn. 17 to be smaller than 1. Furthermore, the theory is derived in the limit of an instantaneous transition from ∇ ad to ∇ rad where δ 0.1 = δ 0.5 = δ 0.9 ; our simulations have a finite transition width. Despite these subtle differences, we find good agreement. Using f ∞ = 0.755 we estimate that δ 0.5 ≈ 0.31 for R → ∞ and plot this as a horizontal orange line on the upper left panel of Fig. 9. This value is coincidentally very near the value of δ 0.5 achieved in our highest-R simulations. Unfortunately, we cannot probe more turbulent simulations. We can only run the R = 6.4×10 3 simulation for a few hundred freefall times. Our accuracy in measuring results from this simulation is limited by the long evolutionary timescales of the simulation (see Fig. 6 for similar evolution in a less turbulent, R = 400 case). Even accounting for our accelerated evolutionary procedure, we can only be confident that the PZ heights of this simulation are converged to within a few percent.
Future work should aim to better understand the trend of PZ height with turbulence. However, the displayed relationships between δ p and f , f and ν , and ν and R -all of which are effects we largely understandsuggest that PZ heights should saturate at high R.
In summary, we find that δ p decreases as R increases. We find that these changes are caused by increases in f . In our simulations, f seems to have a linear relationship with the size of the viscous boundary layer ν . By measuring f and ν in a simulation, the value of f ∞ can be found from Eqn. 39. Stellar convection zones are not adjacent to hard walls 7 , so f ∞ and the limit ν → 0 applies to stellar convection.
While we have examined a Case I simulation with P = 4 here, we expect the simulation with P L = 1 (a linear radiative conductivity profile) to be the most representative of conditions near a stellar convective boundary. In this simulation, we measure ξ ≈ 0.6, f ≈ 0.785, ν ≈ 0.08, and L s = 1. Using Eqn. 39, we estimate that are good first estimates for f and ξ when applying our theory of penetrative convection to stellar models.

TESTING OUR PARAMETERIZATION IN A SIMPLE STELLAR MODEL OF THE SUN
Our simulation results present a strong case for a fluxand dissipation-based model of convective penetration, similar to those considered by Zahn (1991) and Roxburgh (1989). In this section, we discuss a simple stellar model of the Sun which we have created by implementing our parameterization into MESA (see Appendix B). We of course note that the theory and 3D simulations in this work do not include many of the complications of stellar convection like density stratification, sphericity, rotation, magnetism, etc. We present this model as a proof of concept and to inspire further work.
In order to implement our theory into MESA, we need to extend Eqn. 14 to spherical geometry. To do so, we replace horizontal averages in Eqn. 9 with integrals over latitude and longitude, and find that the relevant integral constraint contains the convective luminosity, where L conv = 4πρ 0 r 2 F conv , r is the radial coordinate, and we write the RHS as a volume integral. We next define f in the same way as in Eqn. 11 and define ξ similarly to Eqn. 13, where V PZ and V CZ are the volumes of the PZ and CZ respectively. Eqn. 43 generalizes Eqn. 13 outside of the assumption of a plane-parallel atmosphere. Thus Eqn. 14 in spherical geometry is We implemented Eqn. 44 in MESA (see Appendix B for details) and evolved a 1M model to an age of 4.56 Gyr with f = 0.86 and ξ = 0.6 (Eqn. 41) to qualitatively understand how our penetration parameterization modifies a stellar model. In the top panel of Fig. 10 we display ∇ ≡ d ln T /d ln P from the model which includes convective penetration. Note that ∇ (green) remains close to ∇ ad (purple) below the Schwarzschild convective boundary (∇ ad = ∇ rad ) in a penetration zone. After some depth ∇ → ∇ rad (orange) in the star's interior. We additionally evolved a standard 1 M MESA model to a 4.56 Gyr age without the inclusion of a PZ. We compare the sound speed c profiles of the PZ and standard (std) model in the bottom panel of Fig. 10. When a PZ is present beneath a CZ, ∇ experiences a sharp jump from ∇ ad to ∇ rad (Fig. 10, top panel), resulting in an acoustic "glitch" in the sound speed profile.
In the model shown in Fig. 10, we find H p ≈ 0.082R at the Schwarzschild CZ boundary, and the depth of the penetration zone in Fig. 10 is 0.042R ∼ 0.5H p . The inclusion of this PZ leads to an O(2%) increase in c near the base of the solar convection zone. Helioseismic observations suggest a similar increase below the base of the solar convection zone (e.g., Christensen-Dalsgaard et al. 2011, their Fig. 17). The difference ∆c = c PZ −c std that we see in this stellar model of the Sun (Fig. 10) has the same sign and roughly the same shape. However, the magnitude of the change in c is larger than is observed; literature values include ∆c/c ≈ O(1%) (Bergemann & Serenelli 2014) and ∆c 2 /c 2 ≈ O(0.4%) (Christensen-Dalsgaard et al. 2011), and our sound speed bump is located at a different radius than the observed bump. Other helioseismic studies have argued that that the solar PZ depth cannot be larger than O(0.05 H p ), because larger PZs would result in larger glitches than are detected (see Sct. 7.2.1 of Basu 2016, for a nice review). It is interesting, however, that the width of the PZ in Fig. 10 is strikingly similar to the inferred width of the tachocline (0.039 ± 0.013)R that is reported by Charbonneau et al. (1999).
It is unsurprising that our Boussinesq-based model only qualitatively matches observational constraints for the solar CZ. The solar convection zone is highly stratified (∼14 density scale heights), and we neglected density stratification in this work. Furthermore, the solar model used here is essentially a "stock" MESA model and has obvious disagreements with the solar model S (see Fig. 1 in Christensen-Dalsgaard et al. 2011, where the Schwarzschild base of the CZ is r/R ≈ 0.712, whereas the one in Fig. 10 is at r/R ≈ 0.75). Despite the limitations of this minimal proof of concept, Fig. 10 shows that our parameterization can produce penetration zones in 1D models with measurable acoustic glitches. In a future paper, we will produce more realistic models by building upon our parameterization to include the crucial effects of density stratification. We note briefly that the theory in e.g., Eqn. 44 only knows about integral quantities of the convection and does not therefore know about quantities like the filling factor of upflows and downflows which stratification would modify. We suspect that dynamical differences that arise from including stratification would manifest as changes in f and ξ, but a detailed exploration is beyond the scope of this work.

DISCUSSION
In this work, we presented dynamical simulations of convective penetration, in which convection mixes ∇ → ∇ ad beyond the Schwarzschild boundary. To understand these simulations, we used an integral constraint (reminiscent of Roxburgh 1989) and flux-based arguments (similar to Zahn 1991) to derive a parameterization of convective penetration according to the convective flux and viscous dissipation. In doing so, we have laid down the first steps (Eqns. 14 & 44) towards incorporating convective penetration into stellar structure codes. We parameterized the viscous dissipation into a bulk-CZ portion (f ) and a portion in the extended penetrative region (ξ), and derived predictions for how the height of a penetrative region δ p should scale with these measurable parameters and a new flux-based "penetration parameter" P. We designed and analyzed two sets of simulations which showed good agreement with these theoretical predictions. These simulations differ from past studies because we separately specify P and the stiffness S, and we allow the simulations to evolve for a very long time or use numerical techniques for rapid evolution. We briefly examined what the impliciations of this theory could be for a simple stellar model.
Our simulation results suggest that stellar convection zones could be bounded by sizeable penetration zones. In extreme simulations, we observe penetration zones which are as large as the convection zones they accompany; however, for realistic stellar values (P ≈ 1), we find that they may be as large as 20-30% of the convective zone length scale (∼the mixing length).
The simulations we presented in this work use a simplified setup to test the basic tenets of our theory. In particular, they demonstrate that the shape of the flux near the convective boundary and the viscous dissipation together determine the height of the penetration zone. The precise values of the parameters f and ξ achieved in natural, turbulent, fully compressible, spherical stellar convection may be different from those presented in e.g., Fig. 7 and Eqn. 41 here. Future work should aim to understand how these parameters and the theory presented in e.g., Eqn. 44 change when more realistic effects are taken into account.
Stellar opacities and thus stellar radiative conductivities are functions of thermodynamic variables rather than radial location. The formation of a penetration zone will therefore affect the conductivity profile and ∇ rad , which will in turn affect the location of the Schwarzschild boundary and the estimate of how deep the penetration zone should be. In other words, convective penetration and entrainment both occur in realistic settings, and their combined effects should be studied. Future work should follow e.g., Käpylä et al. (2017) and implement realistic opacity profiles which evolve selfconsistently with the thermodynamic state in order to understand how these effects feedback into one another.
Our simulation setup (in which convection is driven by internal heating and stopped by a radiative flux divergence) most closely imitates core convection in massive stars. Other shell or envelope convection zones in stars are driven entirely by divergences in the radiative flux. These divergences act as radiative heating (at the base of the convection zone) and radiative cooling (at the top of the convection zone). We suspect that our simulation setup (and separate specification of P and S) could straightforwardly be implemented in a model where the total flux is constant with height and convection is driven entirely by changes in k with height. Future work should test this by examining three-layer experiments where a CZ sits between two RZs, and the convection is driven at the base by a decrease in k and then stopped by an increase in k at the top. These experiments would help constrain how penetration zone depths change when two PZs (one above, one below) must be accounted for in the integral constraint.
Our work here assumes a uniform composition through the convective and radiative region. Convective boundaries often coincide with discontinuities in composition profiles (Salaris & Cassisi 2017). Future work should determine if stabilizing composition gradients can prevent the formation of the penetration zones seen here.
Furthermore, stellar fluid dynamics exist in the regime of Pr 1 (Garaud 2021). Dynamics in this regime may be different from those in the regime of Pr 1 that we studied here, which in theory could affect f and ξ. Recently, Käpylä (2021) found that convective flows exhibited more penetration at low Pr than high Pr. Future work should aim to understand whether f and/or ξ depend strongly on Pr in the turbulent regime.
Two other interesting complications in stellar contexts are rotation and magnetism. In the rapidly rotating limit, rotation creates quasi-two-dimensional flows, which could affect the length scales on which dissipation acts and thus modify f . Furthermore, magnetism adds an additional ohmic dissipation term, which could in theory drastically change our hydrodynamical measurement of f .
In summary, we have unified Roxburgh (1989)'s integral constraint with Zahn (1991)'s theory of fluxdependent penetration into a parameterized theory of convective penetration. We tested this theory with simulations and found good agreement between the theory and our simulations. In future work, we will use simulations to test some of the complicating factors we discussed here and aim to more robustly implement convective penetration into MESA.
We thank Keaton Burns, Matt Browning, Matteo Cantiello, Geoff Vasil, and Kyle Augustson for useful discussions and/or questions which improved the content of this manuscript. Ben Brown thanks Jeffrey Oishi for many years of discussions about overshooting convection.
We thank the anonymous referee for carefully reading our manuscript, engaging with our science, and helping identify places where our descriptions of our simulations were confusing. EHA is funded as a CIERA Postdoctoral fellow and would like to thank CIERA and Northwestern University. We acknowledge the hospitality of Nordita during the program "The Shifting Paradigm of Stellar Convection: From Mixing Length Concepts to Realistic Turbulence Modelling," where the groundwork for this paper was set. This work was supported by NASA HTMS grant 80NSSC20K1280 and NASA SSW grant 80NSSC19K0026. Computations were conducted with support from the NASA High End Computing (HEC) Program through the NASA Advanced Supercomputing (NAS) Division at Ames Research Center on Pleiades with allocation GID s2276. The Flatiron Institute is supported by the Simons Foundation.  Fig. 6, the time evolution of simulations which start from a state based on the Schwarzschild criterion can be prohibitively long. In Anders et al. (2018), we explored the long time evolution of simple convective simulations and found that fast-forwarding the evolution of a convective simulation's internal energy and thermal structure can be done accurately. This can be done because the convective dynamics converge rapidly even if the thermal profile converges slowly. This same separation of scales is observed in the penetrative dynamics in this work, and so similar techniques should be applicable.
To more quickly determine the final size of the evolved penetration zones we use the following algorithm.
1. Once a simulation has a volume-averaged Reynolds number greater than 1, we wait 10 freefall times to allow dynamical transients to pass.
3. We linearly fit each of the departure points' evolution against time using NumPy's polyfit function. We assume that convective motions influence δ 0.1 and δ 0.5 more strongly than δ 0.9 . We measure the time-evolution of the convective front dδp dt by averaging the slope of the linear fits for δ 0.1 and δ 0.5 . 4. We take a large "time step" of size τ AE forward. We calculate ∆δ p = τ AE dδp dt .
• If ∆δ p < 0.005, we erase the first 15 time units worth of departure point measures and return to step 2 for 15 time units.
• If ∆δ p is large, we adjust the top of the PZ by setting δ 0.5,new = δ 0.5 t + ∆δ p (angles represent a time average). If |∆δ p | > 0.05, we limit its value to 0.05. We calculate the width of the PZ-RZ boundary layer d w as the minimum of δ 0.9 − δ 0.5 t and δ 0.5 − δ 0.1 t . We adjust the mean temperature gradient to where H is defined in Eqn. 34 and ∆∇ = ∇ rad − ∇ ad . We also multiply the temperature perturbations and full convective velocity field by (1−H(z; 1, 0.05)). This sets all fluctuations above the nominal Schwarzschild convection zone to zero, thereby avoiding any strange dynamical transients caused by the old dynamics at the radiative-convective boundary (which has moved as a result of this process).
In general, the initial profile of T that we use when we start our simulations is given by Eqn. A1 with a value δ 0.5,new ≥ 0. We then evolve T towards a statistically stationary state using the above algorithm and standard timestepping. If a simulation returns to step 2 from step 4 ten times over the course of its evolution, we assume that it has converged near its answer, stop this iterative loop, and allow the simulation to timestep normally. Additionally, in some simulations, we ensure that this process occurs no more than 25 times. This process effectively removes the long diffusive thermal evolution on display in the upper left panel of Fig. 6 by immediately setting the mean temperature profile to the radiative profile above the PZ. In Fig. 11, we plot in black the time evolution of δ p and f in Case I simulations with S = 10 3 , R = 400, and P D = [1, 2, 4]. We overplot the evolution of simulations which use this accelerated evolution (AE) procedure using orange and green lines. Time units on the x-axis are normalized in terms of the total simulation run time in order to more thoroughly demonstrate the evolutionary differences between standard timestepping and AE. However, the AE simulations are much shorter: the vertical green-and-yellow lines demonstrate how long the AE simulation ran compared to the standard timestepping simulation (so for P D = 1, the AE simulations only took ∼ 1/4 as long; for P D = 2, they took ∼ 1/10 as long; for P D = 4, they took ∼ 1/20 as long). AE simulations with orange lines start with PZ heights which are much larger than the final height, while green line solution start with initial PZ heights which are smaller than the expected height. Regardless of our choice of initial condition, we find that this AE procedure quickly evolves our simulations to within a few percent of the final value. After converging to within a few percent of the proper penetration zone height, this AE procedure continues to iteratively "jitter" around the right answer until the convergence criterion we described above are met. These jitters can be seen in the top panels of Fig. 11, where the solution jumps away from the proper answer in one AE iteration before jumping back towards it in the next iteration. If the PZ height continues to noticeably vary on timescales of a few hundred freefall times, we continue to timestep the simulations until the changes of δ p have diminished.
B. MESA IMPLEMENTATION Our 1D stellar evolution calculations were performed using the Modules for Experiments in Stellar Astrophysics software instrument (Paxton et al. 2011.

B.2. Penetration Implementation
Here we describe a first implementation of Eqn. 44 in MESA. We note that this impelementation is likely not universal or robust enough to be used in most complex stellar models, but it is robust enough to time-step stably and produce the results displayed in Sct. 6. Future work should improve upon this model. After converging to within a few percent, the accelerated evolution procedure "jitters" around the equilibrated value. Time units are normalized by the total run time of the simulation. Accelerated simulations were run for tsim = 3000 freefall times. The standard timestepping (black line) simulations were run for tsim = 1.2 × 10 4 (PD = 1), tsim = 3.2 × 10 4 (PD = 2), and tsim = 6.7 × 10 4 (PD = 4) freefall times. The vertical green-and-yellow lines show the total simulation time of the accelerated simulation in terms of the direct simulation time; i.e., the accelerated simulation converged in only ∼ 5% of the simulation time of the direct simulation for P = 4. (Bottom row) Rolling average of f over 200 freefall times, plotted in the same way as δ0.5.
To find the extent of the penetrative region we write Eqn. (44) as (1 − f ) CZ L conv dr = PZ (ξf L conv,avg,CZ + L conv ) dr, where L conv,avg,CZ is the average of L conv in the convection zone and L conv in the penetrative region is given by which is the excess luminosity carried if the temperature gradient in the radiative zone is adiabatic. We first integrate the left-hand side of Eqn. (B2) over the convection zone and further use that to evaluate L conv,avg,CZ . Next we integrate the right-hand side of the same away from the convective boundary into the radiative zone until the equation is satisfied. The point where this integration stops is the edge of the penetrative region.
We then implement convective penetration in stellar evolution with two modifications. First, we add an extra chemical mixing term in the penetration zone with a scale of D ≈ H p (L/4πr 2 ρ) 1/3 , which is roughly the scale of the convective diffusivity. The precise choice of diffusivity here does not matter, as any plausible scale will be enough to eliminate any composition gradient on evolutionary time-scales. Secondly, we override the default routine in MESA for determining ∇ and instead have the solver reduce ∇ a − ∇ by 90 per cent in the penetrative zone.
Using this procedure with f = 0.86 and ξ = 0.6, and timestepping a solar model to the age of the current Sun (∼ 4.5 Gyr), we find the profile displayed in Sec. 6.

B.3. Models
Models were constructed to reasonably reproduce the present-day Sun and based on the 2019 MESA summer school lab by Pinsonneault (2019). Inlists and the run star extras source code are available in a Zenodo repository (Anders et al. 2021).

C. TABLE OF SIMULATION PARAMETERS
Input parameters and summary statistics of the simulations presented in this work are shown in Table 1.  Note-Simulation type is specified as "D" for discontinuous/Case I or "L" for linear/Case II. "D/SF" simulations have stressfree boundary conditions. Input control parameters are listed for each simulation: the penetration parameter P, stiffness S, and freefall Reynolds number R. We also note the coefficient resolution (Chebyshev coefficients nz and Fourier coeficients nx, ny).
We report the number of freefall time units each simulation was run for t sim . Time-averaged values of the departure heights (δ 0.1 , δ 0.5 , δ 0.9 ), the dissipation fraction f , and the dissipation fall-off ξ, as well as the average convection zone velocity u are reported. We take these time averages over the final 1000 freefall times or half of the simulation, whichever is shorter.