A cone restriction estimate using polynomial partitioning

We obtain an improved Fourier restriction estimate for the truncated cone using the method of polynomial partitioning in dimension $n\geq 3$, which in particular solves the cone restriction conjecture for $n=5$ and recovers the sharp range for $3\leq n\leq 4$. The main ingredient of the proof is a $k$-broad estimate for the cone extension operator, which is a weak version of the $k$-linear cone restriction estimate for $2\leq k\leq n$.

Theorem 1. For $n \geq 3$, the operator $E$ satisfies the estimate $\|Ef\|_{L^p(\mathbb{R}^n)} \lesssim \|f\|_{L^p(2B^{n-1}\setminus B^{n-1})}$ whenever $p > 4$ if $n = 3$, $p > 2\cdot\frac{3n+1}{3n-3}$ if $n > 3$ is odd, and $p > 2\cdot\frac{3n}{3n-4}$ if $n > 3$ is even.
When $n = 3$ and $n = 4$, this recovers the sharp range of $p$ ($p > 4$ for $n = 3$ by Barcelo [1] and $p > 3$ for $n = 4$ by Wolff [16]) for which the cone restriction estimate holds. When $n = 5$, Theorem 1 gives for the first time the sharp range $p > \frac{8}{3}$ for the cone restriction estimate. When $n \geq 6$, Theorem 1 provides new partial progress towards the sharp range $p > \frac{2(n-1)}{n-2}$ conjectured by Stein [12]. Before our work, the best known range for $n > 4$ was $p > \frac{2(n+2)}{n}$, proved by Wolff [16] using the bilinear method.
The Fourier restriction problem on various surfaces with enough curvature is closely related to many problems in analysis, such as the Kakeya conjecture, the Bochner-Riesz conjecture, and the local smoothing conjecture in PDE. It has been studied extensively for decades; we refer to [6,9,10] and the references therein in addition to the aforementioned works. However, there are very few surfaces and dimensions for which a sharp restriction theorem is known. For example, the restriction conjecture for the paraboloid remains open for $n \geq 3$. It is also known ([14]) that there is a certain link between the restriction estimate for the cone in $\mathbb{R}^{n+1}$ and that for the paraboloid, sphere, or other conic sections in $\mathbb{R}^n$, which suggests possible further applications of our result.
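As a quick sanity check on the ranges discussed above, the following sketch compares the lower endpoint of Theorem 1 with the conjectured endpoint $\frac{2(n-1)}{n-2}$ and the earlier bilinear endpoint $\frac{2(n+2)}{n}$. The function names are illustrative only, and the odd-dimensional formula $2\cdot\frac{3n+1}{3n-3}$ is an assumption: it is chosen so that it agrees with the value $p > \frac{8}{3}$ quoted for $n = 5$ and with the exponent $2\cdot\frac{n+k}{n+k-2}$ of Theorem 3 at $k = \frac{n+1}{2}$.

```python
from fractions import Fraction


def theorem1_threshold(n: int) -> Fraction:
    """Lower endpoint p_0(n): Theorem 1 is read here as covering p > p_0(n).

    The odd case uses 2(3n+1)/(3n-3), an assumed reading of the statement.
    """
    if n < 3:
        raise ValueError("Theorem 1 is stated for n >= 3")
    if n == 3:
        return Fraction(4)
    if n % 2 == 1:
        return 2 * Fraction(3 * n + 1, 3 * n - 3)
    return 2 * Fraction(3 * n, 3 * n - 4)


def conjectured_threshold(n: int) -> Fraction:
    """Conjectured sharp endpoint 2(n-1)/(n-2)."""
    return Fraction(2 * (n - 1), n - 2)


def bilinear_threshold(n: int) -> Fraction:
    """Endpoint 2(n+2)/n of the earlier bilinear range."""
    return Fraction(2 * (n + 2), n)


if __name__ == "__main__":
    print("n  Theorem 1  conjectured  bilinear")
    for n in range(3, 9):
        print(n, theorem1_threshold(n), conjectured_threshold(n), bilinear_threshold(n))
```

Under these assumptions the Theorem 1 endpoint coincides with the conjectured one for $n = 3, 4, 5$ and lies strictly between the conjectured and the bilinear endpoints for $n = 6, 7, 8$.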
It is also interesting to study $L^q \to L^p$ restriction estimates for $q \neq p$. It is conjectured that $E : L^q \to L^p$ is bounded whenever $p > \frac{2(n-1)}{n-2}$ and $q' \leq \frac{n-2}{n} p$, a condition known to be necessary. When $q > p$, such an estimate follows immediately from Theorem 1 by Hölder's inequality and the compact support of $f$. When $q < p$, Theorem 1 does not apply directly; however, one can obtain the following estimate by slightly modifying the proof of Theorem 1.
Theorem 2. For $n \geq 3$, the operator $E$ satisfies the estimate $\|Ef\|_{L^p(\mathbb{R}^n)} \lesssim \|f\|_{L^q(2B^{n-1}\setminus B^{n-1})}$ whenever the tuple $(p, q, k)$ is admissible in the sense that $q > 2$, $2 \leq k \leq n$ and
For each fixed $n$, one can optimize the range of the $L^q \to L^p$ restriction estimate above by choosing the best $k$. In particular, in the case $n = 5$, taking $k = 3$, Theorem 2 implies the optimal conjectured range $p > \frac{8}{3}$, $q' \leq \frac{3}{5} p$. The result in the open range of (1.1) follows from an argument similar to that for Theorem 1. In order to obtain the estimate on the sharp boundary line, we additionally apply a bilinear interpolation with a bilinear cone restriction estimate obtained in [16]. The interpolation argument is adapted from the work of Tao, Vargas and Vega [15], where a paraboloid version is demonstrated.
The main ingredient of the proof of the theorems above is polynomial partitioning, a tool for describing the algebraic structure of the set where $|Ef|$ is large. The idea of applying the polynomial method in harmonic analysis dates back to the solution of the finite field Kakeya problem by Dvir [8], and it was Guth and Katz in [11] who first introduced the tool of polynomial partitioning when solving the Erdős distinct distances problem in combinatorics. It has recently been shown to be an extremely powerful tool in harmonic analysis as well, for instance in obtaining restriction estimates for the paraboloid in the works of Guth [9,10] and in deriving the sharp Schrödinger maximal estimate in $\mathbb{R}^2$ in [7] by Du, Guth and Li, which motivates our present work.
More precisely, polynomial partitioning will be used in Section 2, where we prove a $k$-broad restriction inequality on the cone (Theorem 3), which is a weak version of the $k$-linear restriction estimate (2.3). Roughly speaking, the $k$-broad inequality provides an estimate for $|Ef|$ after eliminating the parts of it that lie within a neighborhood of a few $(k-1)$-planes in $\mathbb{R}^n$ (see Definition 2.1 in Section 2). The proof of Theorem 3 follows the fairly standard scheme of polynomial partitioning. In contrast to the paraboloid, the cone contains infinitely many straight lines along which the curvature vanishes. This difference already surfaces in the proof of the $k$-broad estimate. We defer a more detailed discussion of this aspect to Section 2. The main linear restriction theorems will then be obtained from the $k$-broad estimates, together with decoupling and Lorentz rescaling.
The article is organized as follows. We first recall several facts concerning the wave packet decomposition before ending the introduction. In Section 2, we prove the aforementioned $k$-broad restriction inequality, which will be applied in Section 3 to obtain the main results, Theorems 1 and 2.
When restricted to a large ball $B_R$ centered at the origin with radius $R$, $Ef^{\ell}_{\theta,v}$ is essentially supported on a thin tube $T^{\ell}_{\theta,v}$ of length 1 in the mini direction $M(\theta)$, length $R$ in the long direction $L(\theta)$, and length $R^{(1+\delta)/2}$ in the remaining directions. The thin tube is one of the $\sim \lfloor R^{(1+\delta)/2} \rfloor$ tubes that partition a regular tube $R_{\theta,v}$ of length $R$ and radius $R^{(1+\delta)/2}$. Note that the directions of the tubes are determined by the sector $\theta$, and that the mini direction and the long direction are orthogonal to each other. Indeed, let $\xi_\theta = (\xi_{\theta,1}, \dots, \xi_{\theta,n-1})$ be the point on the central line of $\theta$ with $|\xi_\theta| = 1$; then the long direction of both the thin and the regular tubes is parallel to $(\xi_{\theta,1}, \dots, \xi_{\theta,n-1}, -1)$, while the mini direction of the thin tube is parallel to $(\xi_{\theta,1}, \dots, \xi_{\theta,n-1}, 1)$. One can write and specify the labeling of the thin tubes so that $T^{\ell}_{\theta,v}$ is the thin tube inside $R_{\theta,v}$ that is $\sim \ell R^{-(1+\delta)/2}$ away from its center in the mini direction. Sometimes we also use the simplified notation $T_{\theta,v}$ for $T^{\ell}_{\theta,v}$ when there is no need to specify $\ell$. Note that the wave packets $f^{\ell}_{\theta,v}$ are essentially orthogonal, i.e.

A k-BROAD ESTIMATE FOR THE CONE
Fix a large constant $R$. We decompose $2B^{n-1}\setminus B^{n-1}$ in the frequency space $(\xi)$ into sectors $\tau$ of dimension $1\times K^{-1}\times\cdots\times K^{-1}$, where $K \ll R^{\epsilon}$ is a large constant. Write $f = \sum_{\tau} f_\tau$ where $f_\tau$ is supported in $\tau$, and let $G(\tau) = \bigcup_{\theta\subset\tau} L(\theta)$. Then $G(\tau) \subset S^{n-1}$ is contained in a spherical cap of radius $\approx K^{-1}$, representing the possible long directions of wave packets in $Ef_\tau$. For any subspace $V \subset \mathbb{R}^n$, we adopt the notation $\mathrm{Angle}(G(\tau), V)$ for the smallest angle between any non-zero vectors $v \in V$ and $v' \in G(\tau)$.
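To make the size of this decomposition concrete, here is a short count of the sectors, given as a heuristic sketch rather than a precise statement. It assumes the sectors are indexed by $K^{-1}$-separated directions on the unit sphere $S^{n-2} \subset \mathbb{R}^{n-1}$, which is how a $1\times K^{-1}\times\cdots\times K^{-1}$ sector of the annulus $2B^{n-1}\setminus B^{n-1}$ is naturally described.

```latex
% Each sector occupies an angular cap of radius ~ K^{-1} on S^{n-2},
% and a cap of radius r on an (n-2)-dimensional sphere has measure ~ r^{n-2}.
\[
  \#\{\tau\} \;\sim\; \frac{|S^{n-2}|}{(K^{-1})^{\,n-2}} \;\sim\; K^{\,n-2}.
\]
```

This agrees with the count of $K^{n-2}$ sectors used later in the induction argument of Section 3.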
In the physical space $(x)$, we decompose the ball $B_R \subset \mathbb{R}^n$ centered at the origin with radius $R$ into small balls $B_{K^2}$ of radius $K^2$, and for each $B_{K^2} \subset B_R$ consider $\int_{B_{K^2}} |Ef_\tau|^p$ for every $\tau$. For a fixed parameter $A$, define Then for any open set $U$ that is a union of some balls $B_{K^2}$, we can define the $k$-broad part In fact, if defined on each $B_{K^2}$ as a constant multiple of the Lebesgue measure, $\mu_{Ef}$ can be extended to a measure on $B_R$. In particular, . Moreover, by the same argument as in Lemmas 4.1 and 4.2 of [10], a triangle inequality and a Hölder inequality hold for the broad norm. We omit the details.
The main result of this section is the following.
Theorem 3. For any $2 \leq k \leq n$ and any $\epsilon > 0$, there is a large constant $A$ so that holds for any $K$ and any $p \geq \bar{p}(k, n) := 2 \cdot \frac{n+k}{n+k-2}$.
Theorem 3 is a weak version of the $k$-linear cone restriction conjecture, which says that if $U_1, \dots, U_k \subset 2B^{n-1}\setminus B^{n-1}$ are transversal, i.e. $|G(\theta_1) \wedge \cdots \wedge G(\theta_k)| \gtrsim 1$ for any choices of $\theta_j \subset U_j$, and $f_j$ is supported in $U_j$, $1 \leq j \leq k$, then This has been proven in [16] and [4] in the cases $k = 2$ and $k = n$ respectively. When $3 \leq k \leq n - 1$, it is unknown whether the $k$-linear cone restriction estimate holds. The only progress towards it that the authors are aware of is due to Bejenaru [3,2], where some sharp (up to the endpoint) $k$-linear restriction estimates were obtained for a class of hypersurfaces with curvature, including $(k-1)$-conical surfaces, using a very different method. Even though it is a weaker result, (the corresponding version of) the $k$-broad estimate has been shown by Guth in [9,10] to be sufficient for obtaining linear restriction estimates for the paraboloid. This follows from an adaptation of an argument of Bourgain and Guth [6], where a method for converting multilinear restriction estimates into linear restriction estimates is introduced. In this sense, the core power of the $k$-linear restriction estimate is captured by the $k$-broad estimate, which inspires us to take a similar path in our proof and suggests possible further applications to other problems.
In the rest of the section, we prove Theorem 3. As for the paraboloid, we apply the method of polynomial partitioning, which exploits the algebraic structure of the broad part of $|Ef|$. We emphasize the differences between the paraboloid and the cone, and only sketch the parts of the proof where the argument for the paraboloid in [10] applies equally well to our problem.
In the following, we first explain how polynomial partitioning works in $\mathbb{R}^n$ and give an argument that almost proves Theorem 3. We then explain a few issues preventing the argument from working and introduce the remedy. Essentially, instead of directly attacking Theorem 3 in $\mathbb{R}^n$, one needs to prove a stronger inductive estimate (Theorem 4 below) for all $1 \leq m \leq n$, which then recovers Theorem 3 at $m = n$.
2.1. Polynomial partitioning in $\mathbb{R}^n$. Fix $B_R$ as above. By Theorem 1.4 of [9], for a large constant $D$ to be determined later, there exists a (non-zero) polynomial of degree at most $D$ such that its zero set $Z$ divides where each wave packet in the physical space $(x)$ is essentially supported in $T^{\ell}_{\theta,v}$, a thin tube of length $R$, radius $R^{\frac{1+\delta}{2}}$ and thickness 1. In the simplified model where each $T^{\ell}_{\theta,v}$ is reduced to a line segment, $T^{\ell}_{\theta,v}$ intersects at most $D$ different parts $O_i$, which is far fewer than the total number of $O_i$'s. In other words, the wave packets passing through a fixed $O_{i_0}$ do not interact much with the other $O_i$'s, which works in our favor when we do induction. However, unlike a line segment, a tube $T^{\ell}_{\theta,v}$ might intersect many more $O_i$'s. In order to apply the above heuristic, we need to first thicken $Z$ to a wall $W$, which is defined as the R and the fact that each $T^{\ell}_{\theta,v}$ intersects at most $D$ cells. A tube can intersect the wall $W$ in two different ways, either cutting across $W$ or being nearly tangent to $Z$.
Fix δ > 0. If a tube intersects W, then we say it crosses W transversely if it is not By triangle inequality of the broad norm, there are three different cases to consider depending on which type of wave packets make the most contribution in µ E f (B R ): 2.1.1. Cellular case. This case can be treated in the same way as for the paraboloid based on the fact that each tube intersects at most O(D) cells. In fact, it holds even more easily since tubes in the cone case are thinner.
By Plancherel, , there exists at least one cell $O_i$ (in fact this is true for most of the cells) such that both of the following estimates hold: We cover $O_i$ with finitely many balls of radius $R/2$ and induct on the radius of the ball $B_R$. The induction closes if $p > \frac{2n}{n-1}$. More precisely, If $D$ is chosen sufficiently large, the power of $D$ dominates the implicit constant and the induction is closed.
2.1.2. Transversal case. The transversal case is similar to the cellular case. Cover $W$ with balls $\{B_j\}$ of radius $\rho := R^{1-\delta}$ and notice that each $T^{\ell}_{\theta,v} \in \mathbb{T}_{\mathrm{trans}}$ intersects at most $D$ different $B_j$'s. Fix a $B_j$ and let $Ef_j$ : We induct on scales again, Here we have chosen $\delta$ so that $R^{\delta\epsilon} \gg D^{p/2}$, hence the last inequality holds. Note that this argument is still the same as in the paraboloid problem.
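The threshold $p > \frac{2n}{n-1}$ can be traced through the following heuristic computation, modeled on the paraboloid argument of [10]. It is only a sketch: the number of cells ($\sim D^n$), the pigeonholed bounds on the chosen cell, and the form $\mu_{Ef}(B_R) \leq C R^{p\epsilon}\|f\|_{L^2}^p$ of the inductive estimate are assumptions made for illustration, not the displayed estimates of this paper.

```latex
% Assumptions (illustrative): ~ D^n cells O_i; each tube meets O(D) cells, so
%   \sum_i \|f_i\|_{L^2}^2 \lesssim D \|f\|_{L^2}^2,
% where f_i collects the wave packets entering O_i. Pigeonholing gives a cell with
%   \mu_{Ef}(B_R) \lesssim D^{n}\,\mu_{Ef_i}(O_i),
%   \qquad \|f_i\|_{L^2}^2 \lesssim D^{1-n}\,\|f\|_{L^2}^2 .
\begin{align*}
  \mu_{Ef}(B_R)
    &\lesssim D^{n}\,\mu_{Ef_i}(O_i)
     \;\le\; D^{n}\, C R^{p\epsilon}\,\|f_i\|_{L^2}^{p}
       && \text{(induction on the radius)}\\
    &\lesssim D^{\,n-\frac{(n-1)p}{2}}\; C R^{p\epsilon}\,\|f\|_{L^2}^{p}.
\end{align*}
% The exponent of D is negative exactly when
\[
  n-\frac{(n-1)p}{2}<0 \iff p>\frac{2n}{n-1},
\]
% so for D large the factor D^{n-(n-1)p/2} absorbs the implicit constant.
```

The same bookkeeping with $n$ replaced by $m$ gives the condition $\frac{2m}{m-1} < p$ that appears when partitioning inside a lower-dimensional variety.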

2.1.3. Tangential case.
The tangential case is where the cone restriction problem differs from the paraboloid one. Because of the lack of curvature along the straight lines on the cone, we choose to work with wave packets that are thinner than those for the paraboloid. This, however, results in more wave packets lying tangentially inside the $R^{\frac{1+\delta}{2}}$-neighborhood of a variety.
The main strategy in this case is to perform another polynomial partitioning inside $Z$, look into the cellular, transversal and tangential cases at the next level, and repeat. At each step, the dimension of the variety (denoted by $Z$ again) to which the wave packets are tangent is reduced by 1. The iteration stops when $\dim Z < k$, according to the following lemma.
This strategy almost works except that some issues will appear when we perform polynomial partitioning on a sub-variety of lower dimensions. We describe them in the next subsection and introduce how they can be circumvented.

Polynomial partitioning on a general sub-variety of
is a fixed small parameter for each dimension $m$, which is chosen so that the induction on scales closes as in the transversal case above. Assume now that all the wave packets of We proceed as in [10] to perform a further polynomial partitioning on $S$. After splitting $S$ into $O(1)$ pieces, we can partition $S$ as if it were $\mathbb{R}^m$.
Given a Lebesgue (non singular) measure µ on S, there exists a polynomial of degree bounded by D 2 in m variables, such that the zero set Z of the polynomial divide (The definition of the measure µ here is a little technical. One first partitions S into pieces so that each one is essentially flat, then let µ be the push forward of µ E f under the projection from R n onto the flat piece of S ⊂ R m . We refer the reader to Section 8 of [10] for more details.) Next, let the wall W be the R We end up with three different cases as in the previous subsection: cellular, transversal, and tangential. We would like to proceed as before but run into two issues.
In the cellular case, one notices that as long as $\frac{2m}{m-1} < p$, the induction closes as in Subsection 2.1.1. However, it might happen that $\frac{2m}{m-1} \geq p$ when $m$ is too small, which requires a stronger induction: with some $p_{m,k} > p$ and some negative power of $R$ on the right hand side of (2.2), so that we obtain the desired $L^p$ estimate by interpolating the resulting $L^{p_{m,k}}$ estimate with the basic $L^2$ estimate (2.5) below. More precisely, instead of proving Theorem 3 directly, we apply the ideas developed in the previous subsection to prove by induction the following estimate, which is similar to Theorem 8.1 in [10].
and a large parameter $\bar{A}$ such that the following holds. Let $1 \leq m \leq n$ and let $Z = Z(P_1, \dots, P_{n-m})$ be a transverse complete intersection with $\operatorname{Deg} P_i \leq D_Z$. Suppose that $f$ is concentrated on wave packets from $\mathbb{T}_Z$. Then for any $1 \leq A \leq \bar{A}$ and radius $R \geq 1$, . We refer to Section 5 of [10] for the definition of a transverse complete intersection. Observe that when $m = n$ and $Z = \mathbb{R}^n$, by taking $A = \bar{A}$ and $p = p_{n,k}$ one computes $-e + 1/2 = 0$, which implies Theorem 3. We also remark that for $p = 2$, Theorem 4 follows quickly from an $L^2$ estimate similar to Lemma 3.2 of [10]: By interpolation and Hölder's inequality for the broad norm, Theorem 4 thus follows from the endpoint case $p = p_{m,k}$, which we prove by induction.

Remark 2.3.
In fact, when proving the case $m = k$, one needs to first prove the slightly larger endpoint case $p = p_{m,m} + \eta$ and then interpolate. This ensures that the induction on scales argument treating the cellular case closes. We omit the separate discussion of this special case, as the issue can be handled in exactly the same way as in Section 8 of [10].
Note that the strategy introduced in Subsection 2.1 applies equally well to Theorem 4. More precisely, we induct on the dimension $m$, the radius $R$, and the parameter $A$. The base case $m = k - 1$ follows from Lemma 2.2. Now suppose the desired estimate holds whenever we decrease the dimension $m$, the radius $R$, or $A$. We perform a polynomial partitioning as described at the beginning of this subsection; the cellular case can then be treated as in the previous subsection. We omit the details, as the argument is almost the same as for the paraboloid (see Section 8 of [10]).
For the transversal case, we would like to cover the wall $W$ with balls of radius $\rho$ and then apply the induction hypothesis. The induction hypothesis is: suppose $Ef$ is For the sake of brevity, we abbreviate $\delta_{m-1}$ as $\delta$ from now on. There are two barriers preventing us from applying the induction hypothesis directly.
First, $Ef$ is only known to be concentrated on the $R^{\frac{1+\delta}{2}}$-neighborhood of $Z$, which is larger than the $\rho^{\frac{1+\delta}{2}}$-neighborhood. Therefore, one needs to decompose the $R^{\frac{1+\delta}{2}}$-neighborhood of $Z$ into layers of thickness $\rho^{\frac{1+\delta}{2}}$ so that each layer is a $\rho^{\frac{1+\delta}{2}}$-neighborhood of a translate $Z_b$ of $Z$. We also need to perform a wave packet decomposition in $B_\rho$, $Ef|_{B_\rho} = \sum_{(\tilde\theta,\tilde v,\tilde\ell)} Ef^{\tilde\ell}_{\tilde\theta,\tilde v} + \mathrm{RapDec}(\rho)\|f\|_{L^2}$, and verify that each small wave packet lies inside a unique layer $Z_b$ and that $Ef^{\tilde\ell}_{\tilde\theta,\tilde v}$ is $\rho^{1/2+\delta}$-tangent to $Z_b$. This is true and can be argued as in the paraboloid case: each small wave packet comes from some large wave packets that are even more tangent to $Z$, so the small wave packet lies entirely in some layer $Z_b$ and is also $\rho^{1/2+\delta}$-tangent to $Z_b$.
Second, notice that $\rho^{\epsilon+O(\delta)-e+\frac{1}{2}}$ is greater than $R^{\epsilon+O(\delta)-e+\frac{1}{2}}$ for $p_{n,k} \leq p \leq p_{m,k}$. In order to obtain the correct (negative) power, one needs to find more structure across the different layers: the $L^2$-norms on different layers are the same. We refer to this phenomenon as the equidistribution property. Although this property is already derived in [10] for the paraboloid, the same argument does not apply to the cone, since our tubes are thinner and each wave packet comes with its own mini direction. The justification of equidistribution is the main novelty of our proof, which we present in detail in the following subsection. The rest of the proof, assuming equidistribution, then proceeds in the same way as in [10], and we omit it.

Transverse equidistribution estimates.
Intuitively, the equidistribution property holds because of the following heuristic: when all the tubes are tangent to an $m$-dimensional low degree sub-variety $Z$, the situation is similar to a $k$-broad restriction problem in $\mathbb{R}^m$.
Given a point $p = (\xi_1, \dots, \xi_n)$ on the cone, the normal direction $n_p$ at $p$ is parallel to $(\xi_1, \dots, \xi_{n-1}, -\xi_n)$. Fix a ball $B$ of radius $R^{1/2+\delta}$. Let $V$ be the tangent space of $Z$ at some point in $B \cap Z$. Note that it does not matter which point we pick, as shown by Definition 2.1 of tangent tubes. Assume that $V$ is given by the equations Recalling (2.4), define For any function $f : 2B^{n-1}\setminus B^{n-1} \to \mathbb{C}$, let $f_B := \sum_{(\theta,v,\ell)\in\mathbb{T}_{B,Z}} f^{\ell}_{\theta,v}$; then the support of the lift of $f_B$ onto the cone (denoted by $F_B$) lies inside $N_{R^{-1/2+\delta}} V^{+} \cap C$: $\operatorname{supp} F_B$ lies inside $N_{R^{-1/2+\delta}} V^{+}$ by the definition of tangential wave packets, and $\operatorname{supp} F_B$ lies in $C$ by the definition of cone restriction.
Remark 2.4. What does N R −1/2+δ V + ∩ C look like? One special case is when V + is tangent to C. As shown in the proof of Lemma 2.5 below, in this case dim V + ∩ C = 1 and N R −1/2+δ V + ∩ C is a R −1/4+2δ -neighborhood of several radial line segments. In general, if V + is tangent to C up to an angle of R −δ ("K −1 " in Lemma 2.5 below), Otherwise, for any ξ ∈ V + ∩ C, the normal vector n ξ of C at ξ satisfies Then by definition, it is straightforward to check that (w, w n ) ∈ V ⊥ , which in particular implies that the angle between w = (w, −w n ) and (w, w n ) lies in the interval π 2 − K −1 , π 2 + K −1 . Then the following hold true for w and ξ = (ξ, ξ n ) ∈ supp F B : Renormalizing such that |w| = 1, the first inequality above shows that w 2 n − 1 2 ≤ O(K −1 ). Combining the last two estimates together we have |ξ ·w| Thus the support of f B must lie in an O(K −1 )-angular neighborhood ofw. In particular, supp f B lies in an O(K −1 )-angular region in 2B n−1 \ B n−1 , hence we are in case b). = µ E f B (B) = 0.
On the other hand, if we are in case a), the following lemma, adapted from the paraboloid case (Lemma 6.5 of [10]), says that the $L^2$ norm of $Ef_B$ is equidistributed in $B$ along directions transverse to $V$.
Suppose that $B$ is a ball of radius $R^{1/2+\delta}$ in $B_R \subset \mathbb{R}^n$, and is in case a) of Lemma 2.5. Then for any $\rho \leq R$, In the rest of the subsection, we demonstrate that the $L^2$ norm of the part of the function $f$ restricted to case a) of Lemma 2.5 is equidistributed along the direction of a fixed vector $b$; the precise statement is postponed to Lemma 2.9 below. Unlike in the paraboloid problem, we do not have such equidistribution for the entire $f$; however, (2.6) ensures that the leftover part of $f$ is inessential, as it makes a negligible contribution. In order to define the essential part of $f$ that is restricted to case a), we first need to understand the interaction between the wave packet decompositions on the larger ball $B_R$ and on the smaller ball of radius $\rho$.
We now fix a ball $B_\rho(y)$ centered at $y \in \mathbb{R}^n$ of radius $\rho$ ($R^{1/2} < \rho < R$). Without loss of generality, one can assume that $Z \cap B_\rho(y)$ is roughly flat, i.e. for any point $x \in Z \cap B_\rho(y)$, the normal direction of $T_x(Z)$ is $O(K^{-2})$-close to a fixed direction. Indeed, let $\Lambda$ be a $K^{-2}$-net of all directions in $\mathbb{R}^n$; then there are at most $O(K^{2(n-1)})$ points (directions) in $\Lambda$. Decompose $N_{R^{1/2+\delta}}(Z) \cap B_\rho(y) = \bigcup_j U_j$ into $O(K^{2(n-1)})$ parts, such that for each $x \in U_j \cap Z$, the normal direction of $T_x(Z)$ is $O(K^{-2})$-close to a point in $\Lambda$. The disjointness of the $U_j$ and the triangle inequality imply that Here $Ef_j$ : It thus suffices to study each $Ef_j$, as there are only $O(K^{2(n-1)})$ (a constant compared to $R$) of them in total.
It will also be convenient to translate $B_\rho(y)$ so that it is centered at the origin. Let $\tilde x = x - y$ and define $\tilde f$ so that $Ef(x) = E\tilde f(\tilde x)$ (i.e. $\tilde f$ is a modulation of $f$, and in particular their $L^2$ norms agree). In addition to the original wave packet decomposition of $f$, which gives rise to the large tubes $T^{\ell}_{\theta,v}$ of dimension $1\times R^{\frac{1+\delta}{2}}\times\cdots\times R^{\frac{1+\delta}{2}}\times R$, there is the following wave packet decomposition of $\tilde f$ in $B_\rho(y)$: is supported in the sector $\tilde\theta$ of radius $\rho^{-1/2}$ and length 1, and its Fourier transform is essentially supported in a thin tube of thickness 1 and radius $\rho^{\frac{1+\delta}{2}}$. For each $(\tilde\theta,\tilde v,\tilde\ell)$, $E\tilde f^{\tilde\ell}_{\tilde\theta,\tilde v}$ is essentially supported in a thin tube $T^{\tilde\ell}_{\tilde\theta,\tilde v}$ of length $\rho$, radius $\rho^{1/2+\delta}$ and thickness 1. This tube is contained in $B_\rho$ in the $\tilde x$ coordinate and in $B_\rho(y)$ in the original $x$ coordinate. The interaction between these two wave packet decompositions, on the larger ball $B_R$ and the smaller ball $B_\rho(y)$ respectively, is important to our induction on scales argument.
The key observation here is that, inside $B_\rho(y)$, the intersection of large tubes $T^{\ell}_{\theta,v}$ whose long directions are within a $\rho^{-1/2}$ angle looks like a medium tube (of thickness 1, radius $R^{(1+\delta)/2}$, and length $\rho$), which can also be written as a union of parallel small tubes $T^{\tilde\ell}_{\tilde\theta,\tilde v}$. This idea has been exploited in the paraboloid case as well (see Section 7 of [10]); however, the cone case is more subtle because of the presence of the mini directions of the tubes.
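For the reader's convenience, the cardinality of the net can be checked by a standard volume count. This is a heuristic sketch, assuming that "directions in $\mathbb{R}^n$" means points of the unit sphere $S^{n-1}$.

```latex
% A K^{-2}-net of S^{n-1}: each cap of radius K^{-2} on the (n-1)-dimensional
% sphere has measure ~ (K^{-2})^{n-1}, so
\[
  \#\Lambda \;\sim\; \frac{|S^{n-1}|}{(K^{-2})^{\,n-1}} \;\sim\; K^{\,2(n-1)} .
\]
```

Since $K$ is fixed independently of $R$, decomposing into this many pieces costs only a constant factor in the estimates.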
More precisely, for any point $x \in B_\rho(y)$ and any sector $\tilde\theta$ of radius $\rho^{-1/2}$, let $\mathbb{T}_{\tilde\theta,x}$ be the set of large wave packets $T^{\ell}_{\theta,v}$ that pass through $x$ and satisfy $\mathrm{Dist}(\theta,\tilde\theta) \leq \rho^{-1/2}$. Clearly, for any fixed $(\theta, v)$, there exist $O(1)$ values of $\ell$ such that $T^{\ell}_{\theta,v} \in \mathbb{T}_{\tilde\theta,x}$. Therefore, we drop the dependence on $\ell$ in the following. Moreover, for every fixed $\tilde\theta$, which determines (up to $\rho^{-1/2}$) the long and mini directions of the tubes, in order to exhaust all the tubes passing through $B_\rho(y)$ it suffices to choose $x$ from a fixed grid $\mathbb{Z} \times R^{(1+\delta)/2}\mathbb{Z}^{n-2}$. Therefore, we change the notation and write $\mathbb{T}_{\tilde\theta,w}$ instead of $\mathbb{T}_{\tilde\theta,x}$, with $w \in \mathbb{Z} \times R^{(1+\delta)/2}\mathbb{Z}^{n-2}$.
Lemma 2.7. There exists a medium tube $T_{\tilde\theta,w}$ inside $B_\rho(y)$ of length $\rho$, radius $R^{(1+\delta)/2}$, and thickness 1 such that $T_{\tilde\theta,w}$ is contained in the intersection of all $T_{\theta,v} \in \mathbb{T}_{\tilde\theta,w}$.
Proof. It suffices to study the intersection of two tubes $T_{\theta_1,v_1}, T_{\theta_2,v_2} \in \mathbb{T}_{\tilde\theta,w}$, since all tubes pass through the same point $w$. It is shown in the proof of Proposition 8.1 of [5] that the lift of $\tilde\theta$ onto the cone can be inscribed into the $\rho^{-1}$-neighborhood of a cylinder. Let $\pi$ be the projection of the cylinder onto a hyperplane $\mathbb{R}^{n-1}$ along its axis; then $\pi(\theta_1), \pi(\theta_2)$ are contained in two plates of radius $R^{-1/2}$ and thickness $\rho^{-1}$. By duality, they give rise to two tubes $T_1$ and $T_2$ of length $\rho$ and radius $R^{(1+\delta)/2}$, whose long directions are along $\pi(\theta_1)$ and $\pi(\theta_2)$ respectively.
Observe that $T_i$ is essentially the projection of $T_{\theta_i,v_i}$ along its mini direction, which is within a $\rho^{-1/2}$ angle of the axis of the cylinder; hence $T_{\theta_1,v_1} \cap T_{\theta_2,v_2}$ has thickness $\sim 1$ in the direction of the axis of the cylinder. It thus suffices to show that $T_1 \cap T_2$ is of radius $R^{(1+\delta)/2}$ and length $\rho$. This is clearly the case, as their central lines intersect and $\mathrm{Dist}(\theta_1, \theta_2) \lesssim \rho^{-1/2}$ implies that $T_1 \cap T_2$ must still contain a tube of comparable size.
In particular, the lemma above implies that $T_{\tilde\theta,w}$ is essentially the same as $T_{\theta,v} \cap B_\rho(y)$ for any $T_{\theta,v} \in \mathbb{T}_{\tilde\theta,w}$. In addition, the sets $\mathbb{T}_{\tilde\theta,w}$ form a disjoint collection that exhausts all $T_{\theta,v}$ such that $T_{\theta,v} \cap B_\rho(y) \neq \emptyset$.
Let whereTθ ,w is defined as the set of small wave packetsTθ 1 ,ṽ such that small tubeTθ 1 ,ṽ is contained in Tθ ,w and its long direction is within a ρ −1/2 angle from the long direction of Tθ ,w . The collection Tθ ,w is also disjoint and it exhausts allTθ 1 ,ṽ . In particular, the decompositions above are both orthogonal in the sense that there hold From now on, we will focus on the medium tubes since they connect together the two wave packet decompositions. Divide is the union of ball B's in case a) (resp. case b)) as defined in Lemma 2.5. Define f ess := f − Tθ ,w ∩X a = fθ ,w , then where the last step follows from (2.6). It thus suffices to reduce to the study of f ess from this point on.
Fix a medium tube $T_{\tilde\theta,w}$ with $T_{\tilde\theta,w} \cap X_a \neq \emptyset$. Recall that $T_{\tilde\theta,w}$ is a union of parallel small tubes and an intersection of large tubes whose long directions are within an angle of $\rho^{-1/2}$. We then fix a ball $B \subset T_{\tilde\theta,w} \cap X_a$ of radius $R^{1/2+\delta}$, and an $(n-m)$-dimensional vector $b$ with $|b| \leq R^{1/2+\delta}$ that is orthogonal (up to $O(K^{-1})$) to $T_x(Z)$ for some $x \in B_\rho \cap Z$. Note that all large and small tubes that touch $T_{\tilde\theta,w}$ intersect $B$. We claim that the $L^2$ norm of $f_{\mathrm{ess}}$ is equidistributed along the direction of $b$, which will conclude the proof of Theorem 3.
Before we start, note that we choose $b$ uniformly over all balls $B$, which is possible since we have reduced to the case where the variety $Z$ is roughly flat up to $O(K^{-2})$. In the paraboloid setting dealt with in [10], one can choose the $b$'s freely in each $B$, since equidistribution in the physical space (Lemma 6.5 of [10], the analog of our Lemma 2.6) holds on each $B$. Unfortunately, we do not have this luxury with the cone. In fact, it may happen that only one $B$ has equidistribution. A key observation is that this is already good enough for us. Essentially, the equidistribution of $Ef_{\mathrm{ess}}$ in the physical space concluded in Lemma 2.6 gives rise to that of $\|f_{\mathrm{ess}}\|_{L^2}$ in the frequency space, and after breaking $f_{\mathrm{ess}}$ down into the orthogonal pieces $f_{\tilde\theta,w}$, each of which corresponds to a fixed medium tube $T_{\tilde\theta,w}$ and a choice of $B \subset T_{\tilde\theta,w} \cap X_a$, the behavior of $Ef_{\tilde\theta,w}$ inside $B$ controls $\|f_{\tilde\theta,w}\|_{L^2}$. We state this last observation as the following lemma, which is borrowed from the paraboloid case (Lemma 3.4 of [10]). Being a direct corollary of the orthogonality of wave packets and Plancherel, it works equally well for the cone.
Lemma 2.8. Suppose that $f$ is a function concentrated on a set of wave packets $\mathbb{T}$ and that for every $T_{\theta,v} \in \mathbb{T}$, $T_{\theta,v} \cap B_r(z) \neq \emptyset$ for some radius $r \geq R^{1/2+\delta}$. Then $\|Ef\|^2_{L^2(B_{10r}(z))} \sim r\|f\|^2_{L^2}$.
This immediately implies for any $T_{\tilde\theta,w} \cap X_a \neq \emptyset$ and $B \subset T_{\tilde\theta,w} \cap X_a$ that . Along the direction of $b$, we decompose $N_{R^{1/2+\delta}}(Z) \cap B_\rho$ into layers of thickness $\sim \rho^{1/2+\delta}$. Observe that for any $(\tilde\theta,\tilde v)$ in the wave packet decomposition on $B_\rho$, if $\tilde T_{\tilde\theta,\tilde v}$ intersects $N_{\rho^{1/2+\delta}}(Z+b) \cap B_\rho$, then $\tilde T_{\tilde\theta,\tilde v}$ is contained in $N_{2\rho^{1/2+\delta}}(Z+b) \cap B_\rho$ and $\tilde T_{\tilde\theta,\tilde v}$ is tangent to $Z+b$ in $B_\rho$.
Define Then observing that modulation doesn't change the L 2 norm and applying Lemma 2.8 again, for any Tθ ,w ∩ X a = and B ⊂ Tθ ,w ∩ X a there holds Applying (2.8), Lemma 2.6, and then (2.7), we obtain the equidistribution for each fθ ,w : which then implies that of the f ess . Lemma 2.9. Let f ess and b with |b| ≤ R 1/2+δ be defined as above, then Proof. By orthogonality, f ess 2 By definition, each Tθ ,w appearing in f ess intersects at least one ball B of case a). Therefore, we can find a B ⊂ Tθ ,w ∩ X a for each fθ ,w such that (2.9) holds true, which, together with orthogonality, completes the proof.

k-BROAD ESTIMATE IMPLIES LINEAR RESTRICTION
3.1. $L^p \to L^p$ restriction. In this subsection, we demonstrate how Theorem 3, the $k$-broad estimate, implies the main Theorem 1, the linear cone restriction estimate. The first ingredient of the argument is a decoupling result for the cone derived by Bourgain and Demeter in [5], and the second is an induction on scales argument. The main difference from the paraboloid setting lies in the second step, where a Lorentz rescaling is applied to the cone.
More precisely, we are going to prove for any $R > 0$ and $p \leq q \leq \infty$ that The upper bound of the range for $p$ comes from the requirement in the decoupling theorem below. Note that the lower bound $p(k, n)$ is different from the critical index $\bar{p}(k, n) = 2 \cdot \frac{n+k}{n+k-2}$ in Theorem 3. We claim that (3.1) implies Theorem 1 immediately. Indeed, taking $k = \frac{n+1}{2}$ when $n$ is odd and $k = \frac{n}{2} + 1$ when $n$ is even, $\max(p(k, n), \bar{p}(k, n))$ gives the lower bound for $p$ in Theorem 1. Then Theorem 1 follows by interpolating with the trivial $L^\infty$ bound of $E$ and $\epsilon$-removal ([13]).
We now begin the proof of (3.1). The $k$-broad estimate assumption says that where $V_1, \dots, V_A$ are $(k-1)$-planes and we have used the abbreviation $\tau \notin V_a$ to denote $\mathrm{Angle}(G(\tau), V_a) > K^{-1}$, $a = 1, \dots, A$. For each $B_{K^2}$, fix a choice of $V_1, \dots, V_A$ so that the minimum above is attained. Then, where the first term can be bounded using the $k$-broad estimate, while the second term will be handled by the cone decoupling theorem of Bourgain and Demeter, which in our notation states the following.
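The choice of $k$ can be checked against the critical index of Theorem 3 by direct substitution. The computation below is a sketch that only evaluates $\bar{p}(k,n) = 2\cdot\frac{n+k}{n+k-2}$; the value of $p(k,n)$ is not used, and the comparison with the even-dimensional bound $2\cdot\frac{3n}{3n-4}$ of Theorem 1 is included only as a consistency check.

```latex
% n odd, k = (n+1)/2:  n + k = (3n+1)/2,  n + k - 2 = (3n-3)/2.
\[
  \bar p\!\left(\tfrac{n+1}{2},\,n\right)
    = 2\cdot\frac{(3n+1)/2}{(3n-3)/2}
    = 2\cdot\frac{3n+1}{3n-3},
  \qquad\text{e.g. } n=5:\ 2\cdot\tfrac{16}{12}=\tfrac{8}{3}.
\]
% n even, k = n/2 + 1:  n + k = (3n+2)/2,  n + k - 2 = (3n-2)/2.
\[
  \bar p\!\left(\tfrac{n}{2}+1,\,n\right)
    = 2\cdot\frac{3n+2}{3n-2}
    \;<\; 2\cdot\frac{3n}{3n-4},
\]
% since (3n+2)(3n-4) = 9n^2 - 6n - 8 < 9n^2 - 6n = 3n(3n-2).
```

In the even case the critical index $\bar{p}$ is therefore strictly smaller than the bound stated in Theorem 1, which is consistent with that bound coming from $p(k,n)$ in the maximum.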

Theorem 5 ([5]). Assume $\operatorname{supp}\hat{f} \subset N_{K^{-2}}(C)$, the $K^{-2}$-neighborhood of the truncated cone $C \subset \mathbb{R}^n$. Then on any ball $B_{K^2}$ of radius $K^2$, for any $\delta > 0$, and $W_{B_{K^2}}$ is a weight approximately equal to 1 on $B_{K^2}$ and rapidly decaying outside.
Applying this theorem to the second term followed by Hölder's inequality, one obtains where we have observed that the number of τ ∈ V a is max(1, K k−3 ). Summing over B K 2 ⊂ B R , a = 1, ... , A, one has where W := B K 2 ⊂B R W B K 2 satisfies W 1 on B 2R and rapidly decays outside B 2R . Hence, combining with the k-broad estimate, from which we are going to prove by induction on the radius that This is obviously true when R = 1 by the trivial L ∞ bound of E. Assume now that (3.3) holds for radii less than R/2. We apply Lorentz rescaling to handle the contribution of each f τ . To do this, we first observe that our desired estimate (3.3) is preserved under rotations. To see this, it is easier to work with the "lift" of the functions f on R n−1 onto the cone. For any f ∈ L q (2B n−1 \ B n−1 ), define F(ξ) = f (ξ) as a function supported on cone C, then where dσ C is the pull back of the Lebesgue measure on R n−1 under the projection ξ →ξ, and (3.3) can be rephrased as Lemma 3.1. Let F be a function supported on the cone C, and A be any rotation in R n , then the following two inequalities are equivalent: for any set Ω ⊂ R n .
Proof. By a change of variables, since the Jacobian of a rotation is 1, the left-hand sides of the two inequalities coincide, and so do the right-hand sides.
Now we start estimating each $f_\tau$, or equivalently its lift $F_\tau$; we use $\tau$ to denote both the sector in $2B^{n-1} \setminus B^{n-1}$ and its lift onto the cone. For a fixed $\tau$, by the symmetry of the cone, there is no loss of generality in assuming that the central line of $\tau$ is in the first quadrant of the $(\xi_{n-1}, \xi_n)$-plane. (This can be achieved through a rotation fixing the $\xi_n$-axis, mapping $C$ to itself and the central line of $\tau$ into the $(\xi_{n-1}, \xi_n)$-plane, combined with Lemma 3.1.) Next, we want to find a rotation $A$ sending the central line of $\tau$ to the positive half of the $\xi_{n-1}$-axis. In fact, $A$ is exactly the volume-preserving linear transformation mapping $\xi = (\bar{\xi}, \xi_n)$ to $\omega = (\bar{\omega}, \omega_n)$ such that $\xi_{n-1} = (\omega_{n-1} - \omega_n)/\sqrt{2}$, $\xi_n = (\omega_{n-1} + \omega_n)/\sqrt{2}$ and $\xi_j = \omega_j$, $j = 1, \ldots, n-2$, under which the original vertical cone $C = \{\xi \in \mathbb{R}^n : \xi_1^2 + \cdots + \xi_{n-1}^2 = \xi_n^2,\ \xi_n > 0,\ 1 \leq \xi_j \leq 2,\ \forall\, 1 \leq j \leq n-1\}$ is mapped to the tilted cone By a change of variable, where $d\sigma_T$ is the push-forward of $d\sigma_C$ under the rotation $A$.
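The effect of the rotation $A$ can be checked numerically. The following is a minimal sketch, assuming the normalising factor in the change of variables is $1/\sqrt{2}$, so that $A$ is a rotation by 45 degrees in the $(\xi_{n-1}, \xi_n)$-plane; the helper names `rotate_to_omega`, `random_cone_point` and `is_on_tilted_cone` are illustrative and do not come from the paper. It tests that points of $\xi_1^2 + \cdots + \xi_{n-1}^2 = \xi_n^2$ land on the quadric $\omega_1^2 + \cdots + \omega_{n-2}^2 = 2\omega_{n-1}\omega_n$ (obtained by direct substitution), ignoring the truncation of the cone.

```python
import math
import random


def rotate_to_omega(xi):
    """Solve xi_{n-1} = (w_{n-1} - w_n)/sqrt(2), xi_n = (w_{n-1} + w_n)/sqrt(2) for omega.

    The first n-2 coordinates are left unchanged (assumed form of the rotation A).
    """
    *head, a, b = xi  # a = xi_{n-1}, b = xi_n
    s = math.sqrt(2.0)
    return [*head, (a + b) / s, (b - a) / s]


def random_cone_point(n):
    """Random point on the untruncated upper cone xi_1^2 + ... + xi_{n-1}^2 = xi_n^2."""
    base = [random.gauss(0.0, 1.0) for _ in range(n - 1)]
    height = math.sqrt(sum(x * x for x in base))
    return [*base, height]


def is_on_tilted_cone(omega, tol=1e-9):
    """Check w_1^2 + ... + w_{n-2}^2 = 2 * w_{n-1} * w_n up to a relative tolerance."""
    *head, u, v = omega
    defect = sum(w * w for w in head) - 2.0 * u * v
    scale = max(1.0, sum(w * w for w in omega))
    return abs(defect) <= tol * scale


if __name__ == "__main__":
    random.seed(0)
    for n in (3, 4, 5, 6):
        omega = rotate_to_omega(random_cone_point(n))
        print(n, is_on_tilted_cone(omega))
```

A point $(0, \ldots, 0, t, t)$ with $t > 0$ on the central line is sent to $(0, \ldots, 0, \sqrt{2}\,t, 0)$, which lies on the positive $\omega_{n-1}$-axis as the text requires.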
Observing that $\{\omega : \omega \in \tau\}$ is contained in a constant dilation of the tilted cone $T$, say $5T$, there holds is a function on the dilated cone $5T$ such that $G'_\tau(\omega) = G_\tau(\omega)$ on the dilated $A\tau$ and $0$ elsewhere. We then apply $A^{-1}$ to rotate $5T$ back to the vertical position, which leads to . We now end up with a restriction problem on $5C$, while in physical space the linear transformation has sent the ball $B_R$ into a tube of dimensions $RK^{-1} \times \cdots \times RK^{-1} \times RK^{-2} \times R$. There is still an obstruction preventing us from directly using the induction assumption: this tube is not contained in a ball of radius less than $R/2$. Fortunately, this can be easily overcome by covering the tube with no more than $C_0$ balls of radius $R/C_0$, $1 < C_0 \ll K$. The fact that one cannot reduce to $O(1)$ many balls of radius $CRK^{-1}$ shows a key difference between the cone problem and the case of the paraboloid. For each ball $B_{R/C_0}$, one can also assume that it is centered at the origin, as translation in physical space corresponds to modulation in frequency space, which does not change the $L^q$ norm. By the symmetry of $B_{R/C_0}$ and Lemma 3.1, the induction assumption implies that Then, collecting the equalities above, (3.5) Plugging this back into (3.2), one has Observe that there are $K^{n-2}$ sectors $\tau$ in total and recall that $p \leq q$; Hölder's inequality implies that Plugging this into the inequality above, one has where the dependence on $q$ in the exponent of $K$ cancels out. Then the induction closes if the exponent (excluding $\delta$) of $K$ is strictly negative (so that one can always choose $\delta > 0$ small enough to keep the exponent negative). Note that the presence of $C_0$ will not harm us, as for any fixed $\epsilon$ it makes a negligible contribution when $K$ is sufficiently large. When $2 \leq k \leq 3$, when $k > 3$, These give exactly the desired lower bound $p(k, n)$, as claimed in (3.1).
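The exponent count that closes the induction can be sketched in exact arithmetic. The bookkeeping below is a reconstruction under stated assumptions, not a transcription of the paper's displays: decoupling followed by Hölder is assumed to cost $K^{\max(0,k-3)(1/2-1/p)}$, Lorentz rescaling of one sector is assumed to contribute $K^{n/p-(n-2)+(n-2)/q}$, and Hölder over the $K^{n-2}$ sectors is assumed to contribute $K^{(n-2)(1/p-1/q)}$. All function names are illustrative. Under these assumptions the total exponent does not depend on $q$, and it is negative exactly when $p > \frac{2(n-1)}{n-2}$ for $k \leq 3$ and $p > \frac{2(2n-k+1)}{2n-k-1}$ for $k > 3$. With the choice of $k$ made at the beginning of this subsection, this reproduces the thresholds $4$, $3$ and $\frac{8}{3}$ quoted for $n = 3, 4, 5$.

```python
from fractions import Fraction as Fr


def exponent_of_K(n, k, p, q):
    """Total power of K under the assumed bookkeeping (finite p <= q only)."""
    p, q = Fr(p), Fr(q)
    narrow = max(0, k - 3) * (Fr(1, 2) - 1 / p)   # decoupling + Hoelder, l^2 -> l^p
    rescale = n / p - (n - 2) + (n - 2) / q       # Lorentz rescaling of one sector
    holder = (n - 2) * (1 / p - 1 / q)            # Hoelder over K^(n-2) sectors
    return narrow + rescale + holder


def threshold(n, k):
    """Value of p where the assumed exponent vanishes (candidate for p(k, n))."""
    if k <= 3:
        return Fr(2 * (n - 1), n - 2)
    return Fr(2 * (2 * n - k + 1), 2 * n - k - 1)


def k_broad_index(n, k):
    """Critical index 2(n+k)/(n+k-2) of the k-broad estimate (Theorem 3)."""
    return Fr(2 * (n + k), n + k - 2)


def chosen_k(n):
    """k = (n+1)/2 for odd n and k = n/2 + 1 for even n, as in the text."""
    return (n + 1) // 2 if n % 2 else n // 2 + 1


def lower_bound(n):
    """max(p(k, n), p_bar(k, n)) for the chosen k."""
    k = chosen_k(n)
    return max(threshold(n, k), k_broad_index(n, k))


if __name__ == "__main__":
    for n in range(3, 9):
        print(n, chosen_k(n), lower_bound(n))
```

For odd $n$ the two quantities inside the maximum coincide for the chosen $k$, while for even $n$ the induction threshold is the larger one.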
3.2. $L^q \to L^p$ restriction. This subsection is devoted to the proof of Theorem 2, again using the $k$-broad estimate, Theorem 3.

3.2.1. Interior of (1.1). When the pair $(p, q)$ lies strictly inside the claimed range in (1.1), the estimate follows from an argument very similar to the one in the $L^p \to L^p$ case, so we only sketch the modifications needed for the current case $q < p$. More precisely, fix any $R > 0$. When $2 \leq q \leq p \leq \frac{2n}{n-2}$, Theorem 3 tells us that We are going to show that there holds for some $2 \leq k \leq n$.
As in the previous subsection, we start by estimating the "broad" part of $Ef$ using Theorem 3 and treating the "narrow" part using the decoupling theorem of Bourgain and Demeter. After decoupling, we apply Hölder's inequality to change from $\ell^2$ to $\ell^q$, reaching the estimate Summing over $a = 1, \ldots, A$ and then $B_{K^2} \subset B_R$ using Minkowski's inequality, where we have summed up $W_{B_{K^2}}$ to a single weight $W$ as in the previous subsection. This gives us a slightly different form of (3.2): We then proceed in exactly the same way as in the previous subsection, applying induction on scales to get (3.5). Without the need for Hölder's inequality, one can plug it into (3.8) to directly obtain These are exactly the desired conditions in (3.7).

Remark 3.2. In the case $q = 2$, the range of tuples $(p, q, k)$ in (3.7) is empty for all $2 \leq k \leq n$, which explains the extra restriction one needs to put on $q$ in the admissibility condition (1.1). Moreover, the elimination of the endpoint of the range of $p$ follows from $\epsilon$-removal.

3.2.2. Boundary of (1.1). In the previous subsection, we have already obtained the desired linear restriction estimate for all pairs $(p, q)$ that lie strictly inside the claimed range (1.1). It thus remains to examine the boundary case $q' = \frac{n-2}{n}\, p$ when $k = 2$ and $p = n \Big/ \left( \frac{2n-k-1}{2} - \frac{n-k+1}{q} \right)$ when $k > 3$.
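The two boundary relations above can be sanity-checked in exact arithmetic. The sketch below uses illustrative function names and reads the second relation as $p = n / \big(\tfrac{2n-k-1}{2} - \tfrac{n-k+1}{q}\big)$, which is an assumption about the flattened formula. It checks two consistency properties. At $q = 2$ both relations give $p = \frac{2n}{n-2}$, the upper end of the range used in Subsection 3.2.1, in line with Remark 3.2. On the diagonal $q = p$, the $k = 2$ relation gives $p = \frac{2(n-1)}{n-2}$, the conjectured endpoint, and the $k > 3$ relation gives $p = \frac{2(2n-k+1)}{2n-k-1}$.

```python
from fractions import Fraction as Fr


def boundary_p(n, k, q):
    """Boundary value of p for a given finite q > 1, per the two stated relations."""
    q = Fr(q)
    if k == 2:
        q_conj = q / (q - 1)            # Hoelder conjugate q'
        return n * q_conj / (n - 2)     # from q' = (n-2)/n * p
    if k > 3:
        return n / (Fr(2 * n - k - 1, 2) - (n - k + 1) / q)
    raise ValueError("the text states the boundary only for k = 2 and k > 3")


def diagonal_point(n, k):
    """Solution of boundary_p(n, k, p) = p, found by solving the relation by hand."""
    if k == 2:
        return Fr(2 * (n - 1), n - 2)
    if k > 3:
        return Fr(2 * (2 * n - k + 1), 2 * n - k - 1)
    raise ValueError("the text states the boundary only for k = 2 and k > 3")


if __name__ == "__main__":
    n = 6
    for k in (2, 4, 5, 6):
        print(k, boundary_p(n, k, 2), diagonal_point(n, k))
```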
To treat the boundary case, we apply a bilinear interpolation argument adapted from Theorem 2.2 of [15], where the case of the paraboloid is studied. The key idea here is that linear and bilinear restriction estimates are essentially equivalent on the boundary line of (1.1).

Theorem 6. Let $n \geq 3$ and $1 < p, q < \infty$ be such that $2p > \frac{2(n-1)}{n-2}$ and $q' \leq \frac{n-2}{n} \cdot 2p$. Let $R(q \to 2p)$ denote the linear cone restriction estimate $\|Ef\|_{L^{2p}(\mathbb{R}^n)} \lesssim \|f\|_{L^q}$ and $R(q \times q \to p)$ denote the bilinear cone restriction estimate for all functions $f_i$ supported in $U_i \subset 2B^{n-1} \setminus B^{n-1}$ such that $U_1, U_2$ are transversal. Then, (1) $R(q \to 2p)$ implies $R(q \times q \to p)$; (2) if $R(\tilde{q} \times \tilde{q} \to \tilde{p})$ holds for all $(\tilde{p}, \tilde{q})$ in a neighborhood of $(p, q)$, then $R(q \to 2p)$ holds.
It seems that this theorem has not been explicitly stated in the literature before, but its proof is very similar to that of Theorem 2.2 of [15]. In particular, the direction of linear implying bilinear restriction simply follows from Hölder's inequality. In order to conclude linear restriction from bilinear restriction, one partitions the cone into sectors at different scales and explores the quasi-orthogonality between pairs of sectors that are close to each other at each scale, which follows from the bilinear restriction estimate after applying Lorentz rescaling as in Subsection 3.1 above. This then yields enough decay for all the scales to be summable. In fact, when $n \geq 4$, the proof proceeds exactly as in Theorem 2.2 of [15] after replacing $n$ by $n - 1$. When $n = 3$, one needs to work through the argument separately, as the case $n - 1 = 2$ is not covered in their theorem, but no new difficulty arises. We omit the details. Therefore, given $2 \leq k \leq n$ and a point $(p, q)$ on the boundary of the region (1.1), it suffices to find a neighborhood of $(p, q)$ where the bilinear restriction holds true. Such a neighborhood can be found by interpolating the bilinear restriction in the interior of (1.1) that is implied by the linear estimate, together with the following theorem of Wolff.

ACKNOWLEDGMENTS
The authors would like to thank Larry Guth for suggesting the problem and multiple enlightening discussions, as well as for carefully reading through a first draft of the article. The authors are also indebted to Ciprian Demeter, Terence Tao