Minimal area of Finsler disks with minimizing geodesics

We show that the Holmes--Thompson area of every Finsler disk of radius $r$ whose interior geodesics are length-minimizing is at least $\frac{6}{\pi} r^2$. Furthermore, we construct examples showing that the inequality is sharp and observe that the equality case is attained by a non-rotationally symmetric metric. This contrasts with Berger's conjecture in the Riemannian case, which asserts that the round hemisphere is extremal. To prove our theorem we discretize the Finsler metric using random geodesics. As an auxiliary result, we show that the integral geometry formulas of Blaschke and Santal\'o hold on Finsler manifolds with almost no trapped geodesics.

A sharp isoembolic inequality valid in all dimensions is due to Berger [Ber80], who showed that the volume of every closed Riemannian $n$-manifold $M$ satisfies
$$\operatorname{vol}(M) \geq \alpha_n \left( \frac{\operatorname{inj}(M)}{\pi} \right)^n \tag{1.1}$$
where $\alpha_n$ is the volume of the canonical $n$-sphere. Furthermore, equality holds if and only if $M$ is isometric to a round sphere. The two-dimensional case was proved earlier in [Ber76]. A long-standing conjecture in Riemannian geometry, also due to Berger, asserts that every ball $B(r)$ of radius
$$r \leq \tfrac{1}{2} \operatorname{inj}(M) \tag{1.2}$$
in a closed Riemannian $n$-manifold $M$ satisfies
$$\operatorname{vol} B(r) \geq \frac{\alpha_n}{2} \left( \frac{2r}{\pi} \right)^n, \tag{1.3}$$
where the right-hand side is the volume of a round hemisphere of intrinsic radius $r$.
An account of isoembolic inequalities and Berger's conjecture is given in [CK03, §6]. A non-sharp volume estimate $\operatorname{vol} B(r) \geq c_n r^n$ was established by Berger [Ber76], [Ber77] for $n = 2$ or $3$, and by Croke [Cro80, Proposition 14] for every $n$. The conjecture (with a sharp constant) is satisfied for metrics of the form $ds^2 = dr^2 + f(r, \theta)^2 d\theta^2$ in polar coordinates when $n \geq 3$; see [Cro83]. In [Cro84], Croke also showed that the optimal inequality (1.3) holds true on average over all balls $B(r)$ of $M$. In the two-dimensional case, where the ball is a disk $D(r)$, the best general estimate $\operatorname{area} D(r) \geq \frac{8-\pi}{2} r^2$ can be found in [Cro09]. The lower bound (1.3) on the area of $D(r)$ has recently been obtained in [Cha+17] by Chambers-Croke-Liokumovich-Wen under the stronger hypothesis that $r \leq \frac{1}{2} \operatorname{conv}(M)$, where $\operatorname{conv}(M)$ is the convexity radius of $M$. (This implies that $r \leq \frac{1}{4} \operatorname{inj}(M)$, since $\operatorname{conv}(M) \leq \frac{1}{2} \operatorname{inj}(M)$.) Note, however, that this stronger condition rules out the possibility that $D(r)$ is a hemisphere of intrinsic radius $r$, which is the only expected equality case of (1.3).
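As a quick check on the constants involved (a worked computation added for orientation, not part of the cited results), the area of a round hemisphere of intrinsic radius $r$ can be compared with the general two-dimensional estimate of [Cro09]. The symbol $R$, denoting the radius of the ambient round sphere, is introduced only for this computation.

```latex
% Requires amsmath. On a round sphere of radius R, the pole-to-equator distance is \pi R / 2.
\begin{align*}
  \frac{\pi R}{2} = r
    \;&\Longrightarrow\; R = \frac{2r}{\pi}, \\
  \operatorname{area}(\text{hemisphere}) &= 2\pi R^2 = \frac{8}{\pi}\, r^2 \approx 2.546\, r^2, \\
  \frac{8-\pi}{2}\, r^2 &\approx 2.429\, r^2 \qquad \text{(general estimate of [Cro09])}.
\end{align*}
```

The gap between the two constants is what the conjecture would close in dimension two.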
The condition that $r \leq \frac{1}{2} \operatorname{inj}(M)$ in Berger's conjecture (1.2) can be relaxed by requiring instead that every interior geodesic in $B(r)$ is length-minimizing. The results of [Ber76], [Ber77] and [Cro80, Proposition 14], for instance, still hold under this more general condition.
In this article, we consider the case of disks with a self-reverse Finsler metric whose interior geodesics are length-minimizing. (A precise definition of Finsler metrics and area can be found in Section 2.) It is natural to expect that the inequality (1.3) holds in this setting. This is the case for isosystolic inequalities on the projective plane, where the canonical round metric minimizes the systolic area among both Riemannian and Finsler metrics; see [Iva02], [Iva11]. However, we show that the round hemisphere is not area minimizing among Finsler metric disks of the same radius whose interior geodesics are length-minimizing. More precisely, we establish a sharp isoembolic inequality for Finsler metrics in the two-dimensional case under the assumption that every interior geodesic is length-minimizing. We observe that the extremal metric is not Riemannian and, surprisingly, not even rotationally symmetric.
Before stating our main result, let us introduce the following definition.
Definition 1.1. A Finsler disk $D$ of radius $r$ with minimizing interior geodesics is a disk with a Finsler metric such that
• every interior point of $D$ is at distance less than $r$ from a specified center point $O$;
• every point of $\partial D$ is at distance exactly $r$ from $O$;
• every interior geodesic of $D$ is length-minimizing.
For instance, a ball of radius $r$ on a complete Finsler plane with no conjugate points is a Finsler disk of radius $r$ with minimizing interior geodesics.
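The optimal version of Berger's conjecture for Finsler surfaces with self-reverse metric is given by the following result.
Theorem 1.2. Let $D$ be a self-reverse Finsler metric disk of radius $r$ with minimizing interior geodesics. Then the Holmes-Thompson area of $D$ satisfies
$$\operatorname{area}(D) \geq \frac{6}{\pi}\, r^2.$$
We emphasize that we make no assumptions on the convexity radius. Furthermore, the inequality is optimal.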
The lower bound is attained by a non-smooth space consisting of a disk of radius $r$ centered at the tip of the cone obtained by gluing together three copies of a quadrant of the $\ell^1$-plane. (Recall that the $\ell^1$-plane is the normed plane where unit balls have the least possible area, according to Mahler's theorem on convex bodies in the plane.) Note that this disk is not rotationally symmetric. In Section 11, we use Busemann's construction of projective metrics (developed in relation with Hilbert's fourth problem) to give another description of this non-smooth extremal metric. More precisely, we define a non-smooth projective metric on the plane where the disk of radius $r$ centered at the origin has area $\frac{6}{\pi} r^2$. Then, we approximate this non-smooth projective metric by smooth projective metrics (which are therefore Finsler and have minimizing interior geodesics) where the area of the disk converges to $\frac{6}{\pi} r^2$, proving that the inequality of Theorem 1.2 is sharp.
Let us further comment on the result proved in [Cha+17] for Riemannian disks $D(r) \subseteq M$ of radius $r \leq \frac{1}{2} \operatorname{conv}(M)$. As previously mentioned, this excludes the possibility that $D(r)$ is a hemisphere of intrinsic radius $r$. Still, the argument in [Cha+17] is valid for Finsler surfaces with self-reverse metric, except for the proof of their Lemma 2.1, which is purely Riemannian. Therefore, a self-reverse Finsler metric disk of radius $r$ in which the distance function from each given point is convex along all geodesics satisfies (1.3). The extremal surfaces that we construct in this paper violate this inequality; however, this poses no contradiction because they have vanishing convexity radius.
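The value $\frac{6}{\pi} r^2$ can be recovered by a direct computation on the three-quadrant cone. The sketch below is added for illustration and relies on the standard fact that, in a normed plane, the Holmes-Thompson area equals the Lebesgue area multiplied by $|B^*|/\pi$, where $B^*$ is the dual unit ball; the notation $|\cdot|$ for Lebesgue area and $\operatorname{area}_{HT}$ are conventions of this example only.

```latex
% Requires amsmath. In one quadrant, the \ell^1-ball of radius r is the triangle x, y >= 0, x + y <= r.
\begin{align*}
  \bigl|\{x, y \ge 0,\ x + y \le r\}\bigr| &= \tfrac{1}{2}\, r^2, \\
  |B^*| &= \bigl|[-1,1]^2\bigr| = 4
    \qquad (\text{the dual unit ball of } \ell^1 \text{ is the unit ball of } \ell^\infty), \\
  \operatorname{area}_{HT} &= \frac{|B^*|}{\pi} \cdot 3 \cdot \tfrac{1}{2}\, r^2
    = \frac{6}{\pi}\, r^2 \approx 1.910\, r^2 .
\end{align*}
```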
Instead of the Holmes-Thompson area, one could consider the Busemann-Hausdorff area, which, in general, is bounded below by the former; see [Dur98]. However, the Busemann-Hausdorff area of the extremal metric in Theorem 1.2 is equal to $\frac{3}{4} \pi r^2$, which is still less than the area $\frac{8}{\pi} r^2$ of the round hemisphere of intrinsic radius $r$ that is conjectured to be minimal.
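For comparison, the Busemann-Hausdorff area of the three-quadrant $\ell^1$ cone can be computed in the same way as its Holmes-Thompson area. This is an illustrative sketch which assumes the usual normalization of the Busemann-Hausdorff area, namely that the unit ball of a normed plane has area $\pi$; the notation $\operatorname{area}_{BH}$ is introduced here.

```latex
% Requires amsmath. The \ell^1 unit ball has Lebesgue area 2, so the Busemann-Hausdorff
% density with respect to Lebesgue measure is \pi / 2.
\begin{align*}
  \operatorname{area}_{BH} &= \frac{\pi}{2} \cdot 3 \cdot \tfrac{1}{2}\, r^2
    = \frac{3}{4}\, \pi r^2 \approx 2.356\, r^2, \\
  \frac{6}{\pi}\, r^2 \approx 1.910\, r^2
    \;&<\; \frac{3}{4}\, \pi r^2 \approx 2.356\, r^2
    \;<\; \frac{8}{\pi}\, r^2 \approx 2.546\, r^2 .
\end{align*}
```

The last line places the Busemann-Hausdorff area of the cone between its Holmes-Thompson area and the area of the round hemisphere of intrinsic radius $r$.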
The proof of Theorem 1.2 and the construction of extremal and almost extremal metrics occupy the whole article. The approach, based on a discretization of the metric (cf. [Cos18]), is fairly robust and new in this context.
The article is organized as follows.
In Section 2, we recall the notions of Finsler manifolds, their Holmes-Thompson measure, and their geodesics described from the Hamiltonian point of view.
In Section 3, we go over the standard proofs of the integral geometry formulas of Blaschke and Santaló, showing that they are valid for Finsler manifolds with almost no trapped geodesics. In the case of a disk as in Theorem 1.2, the formulas say that the length of a curve in the disk is proportional to the expected number of intersections with a random geodesic, and the area of a region is proportional to the expected length of the intersection with a random geodesic.
In Section 4, we introduce the notion of a quasi wall system on a surface, generalizing the wall systems studied in [Cos18]. A quasi wall system on a surface is a 1-dimensional submanifold satisfying certain conditions. It determines a discrete metric, according to which the length of a curve is its number of intersections with the quasi wall system, and the area of the surface is the number of self-intersections of the quasi wall system. We show how to approximate a self-reverse Finsler metric with minimizing geodesics by a quasi wall system consisting of random geodesics. To prove the approximation properties we use the integral geometry formulas to compute the expected values of discrete length and area, and then we apply the law of large numbers.
In Section 5, we use this approximation result to show that Theorem 1.2 follows from an analogous theorem on simple discrete metric disks.
Sections 6, 7, 8 and 9 are devoted to the proof of this discrete theorem. The proof is based on identifying certain configurations on a quasi wall system and operating on these configurations in order to transform a simple discrete disk into a new one of less area. When the disk has minimum area, none of these configurations is present, and this implies that the quasi wall system is of a special kind where we can compute a lower bound for the area.
In Section 10, we construct a simple discrete disk of minimal discrete area and show that it is unique up to isotopy.
In Section 11, we use Busemann's construction of projective metrics to obtain continuous versions of our discrete area-minimizing disk.
Finally, Section 12 is an appendix where we show that on a Finsler surface with boundary, distance-realizing curves are $C^1$.
Acknowledgment. The first author thanks the Laboratoire d'analyse et de mathématiques appliquées at the Université Gustave Eiffel/Université Paris Est Créteil and the Groupe Troyanov at the École Polytechnique Fédérale de Lausanne for hosting him as a postdoc while this work was done. The second author would like to thank the Fields Institute and the Department of Mathematics at the University of Toronto, where part of this work was accomplished, for their hospitality. The authors thank the referees for their comments, which helped improve the exposition.

Finsler metrics and Holmes-Thompson volume
In this section, we recall basic definitions of Finsler geometry.
(1) Positive homogeneity: $F_x(tv) = t\, F_x(v)$ for every $v \in T_xM$ and $t \geq 0$.
(5) Strong convexity: for any two linearly independent vectors $v, w \in T_xM$, the Hessian value $q_v(w) = \frac{d^2}{dt^2}\big|_{t=0} F_x(v + tw)$ is strictly positive.
Additionally, a Finsler metric $F$ may or may not be
(6) Self-reverse: $F_x(-v) = F_x(v)$ for every $v \in T_xM$.
Equivalently, one could define a Finsler metric by replacing (3) and (5) with the condition that for every nonzero vector $v \in TM$, the Hessian of $F^2$ at $v$ is positive definite; see [Cos20].
In each tangent space $T_xM$, the unit ball and unit sphere determined by the norm $F_x$ are
$$B_xM = \{ v \in T_xM : F_x(v) \leq 1 \} \quad \text{and} \quad U_xM = \{ v \in T_xM : F_x(v) = 1 \}.$$
Similarly, in the cotangent space $T^*_xM$, the norm $F^*_x$ dual to $F_x$ determines a unit co-ball $B^*_xM$ and a unit co-sphere $U^*_xM$.
Remark 2.2. To handle technical details in case $M$ has nonempty boundary, we extend the metric $F$ to a manifold $M^+ \supseteq M$, of the same dimension as $M$ but without boundary.
Definition 2.3. Let $M$ be a manifold with a Finsler metric $F$. The length of a piecewise-$C^1$ curve $\gamma : I \to M$ is defined as the integral of its speed $F(\gamma'(t))$, that is,
$$\operatorname{length}_F(\gamma) = \int_I F(\gamma'(t))\, dt,$$
and the distance $d_F(x, y)$ between two points $x$ and $y$ in $M$ is the infimum length of a curve $\gamma$ in $M$ joining $x$ to $y$.
A distance-realizing curve is a curve $\gamma : I \to M$ such that $d_F(\gamma(t), \gamma(t')) = t' - t$ for every $t < t'$. A geodesic of $M$ is a smooth, unit-speed curve $\gamma : I \to M$ that is extremal for the length functional. In case $M$ has boundary, the extremality is defined by considering variations in $M^+$, see Remark 2.2. Thus the geodesics of $M$ are the geodesics of $M^+$ that are contained in $M$. Equivalently, the geodesics of $M$ are the unit-speed curves that satisfy the Euler-Lagrange equation for the length functional; see Definition 2.6 below for an explicit equation in terms of momentum.
In a compact connected Finsler manifold, every pair of points is joined by a distance-realizing arc.$^1$ A distance-realizing arc contained in the interior of $M$ is necessarily a geodesic and is therefore smooth. However, a distance-realizing arc of $M$ does not necessarily lie in the interior of $M$, even if its endpoints do. Still, if the manifold is two-dimensional, then every distance-realizing arc is $C^1$ and has unit speed; see Theorem 12.1. Thus, in a compact Finsler surface, every pair of points $x, y$ is joined by a $C^1$ arc of length $d_F(x, y)$.
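The tautological one-form $\alpha_M$ on $T^*M$ is defined by $\alpha_M(V) = \xi(d\pi_\xi(V))$ for every $\xi \in T^*M$ and $V \in T_\xi T^*M$, where $\pi : T^*M \to M$ is the canonical projection. The standard symplectic form $\omega_M$ on $T^*M$ is given by $\omega_M = d\alpha_M$.
$^1$ A proof for more general, complete self-reverse metrics is given in [Gro07, §1.12]; see also [Men14, Theorem 9.1] for directed metrics.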
Using canonical coordinates $(x_i, \xi_i)$ on $T^*M$, these forms can be expressed as
$$\alpha_M = \sum_{i=1}^n \xi_i\, dx_i \quad \text{and} \quad \omega_M = \sum_{i=1}^n d\xi_i \wedge dx_i. \tag{2.2}$$
Definition 2.5. Let $(M, F)$ be a Finsler manifold. The Legendre map $\mathcal{L} : UM \to U^*M$ is defined as follows: the image of a unit vector $v \in U_xM$ is the unique unit covector $\xi \in U^*_xM$ such that $\xi(v) = 1$. Since $F$ is strongly convex, the Legendre map is a diffeomorphism. Its inverse is the Legendre map associated to the dual metric $F^*$ on $T^*M$, which is also strongly convex. The unit covectors will also be referred to as momentums. The Hamiltonian lift of a unit-speed curve $\gamma$ in $M$ is the curve $t \mapsto \mathcal{L}(\gamma'(t))$ in $U^*M$.
Definition 2.6. The cogeodesic vector field of a Finsler manifold $M$ is the vector field $Z$ on $U^*M$ given by the equations
$$\iota_Z \bigl( \omega_M|_{U^*M} \bigr) = 0 \quad \text{and} \quad \alpha_M(Z) = 1,$$
where $\iota_Z$ is the operator that contracts a differential form with the vector field $Z$. The integral curves of $Z$ are the Hamiltonian lifts of the geodesics in $M$; see [Cos20].
It follows from the Cartan formula that the forms α M and ω M restricted to U * M are invariant under the cogeodesic flow.
2.4. Holmes-Thompson volume. We will consider the following notion of volume.
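Definition 2.7. The Holmes-Thompson volume of a Finsler $n$-manifold $M$ is defined as the symplectic volume of its unit co-ball bundle $B^*M \subseteq T^*M$, divided by the volume $\epsilon_n$ of the Euclidean unit ball in $\mathbb{R}^n$. That is,
$$\operatorname{vol}_n(M) = \frac{1}{\epsilon_n} \int_{B^*M} \frac{1}{n!}\, \omega_M^n \tag{2.3}$$
where $\omega_M$ is the standard symplectic form on $T^*M$ and $\frac{1}{n!} \omega_M^n = \frac{1}{n!}\, \omega_M \wedge \cdots \wedge \omega_M$ is the corresponding volume form. Equivalently (see Proposition 3.11), the Holmes-Thompson volume is given as an integral over the unit co-sphere bundle by the formula
$$\operatorname{vol}_n(M) = \frac{1}{n!\, \epsilon_n} \int_{U^*M} \alpha_M \wedge \omega_M^{n-1}.$$
The factor $\frac{1}{\epsilon_n}$ ensures that for Riemannian metrics, the Holmes-Thompson definition of volume agrees with the conventional Riemannian definition.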
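The role of the normalizing factor can be seen in two simple cases. The sketch below is added for illustration; it assumes that the normalizing constant is the volume of the Euclidean unit ball in $\mathbb{R}^n$, written $\epsilon_n$ here, and uses $|\cdot|$ for Lebesgue measure, both of which are notational choices of this example.

```latex
% Requires amsmath.
% Case 1: a Euclidean domain M in R^n, where every unit co-ball B^*_x M is a Euclidean unit ball.
\begin{align*}
  \frac{1}{\epsilon_n} \int_{B^*M} \frac{1}{n!}\, \omega_M^n
    &= \frac{1}{\epsilon_n} \int_M |B^*_xM|\, dx
     = \frac{1}{\epsilon_n} \cdot \epsilon_n\, |M| = |M| .
\end{align*}
% Case 2: a domain M in a normed plane with dual unit ball B^* (n = 2, \epsilon_2 = \pi).
\begin{align*}
  \operatorname{area}_{HT}(M) &= \frac{|B^*|}{\pi}\, |M| .
\end{align*}
```

In the first case the Holmes-Thompson volume reduces to Lebesgue measure, as expected for a Riemannian (here flat) metric.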

Integral geometry in Finsler manifolds with almost no trapped geodesics
The goal of this section is to present versions of two classical formulas in integral geometry, namely the formulas of Blaschke [Bla35] and Santaló [San52; San76], which are in turn generalizations for manifolds of the classical Crofton formulas on the Euclidean plane. In [ÁB06], Blaschke's formula is proved for Finsler manifolds whose space of geodesics is a smooth manifold. Here, we give slightly more general versions which hold for Finsler manifolds with almost no trapped geodesics (and, in particular, for compact Finsler manifolds with minimizing interior geodesics). The proofs mimic those given by Blaschke, Santaló, and Álvarez-Paiva-Berck. However, we give them in full in order to provide additional details and introduce the few extra steps needed for the generalization.
Definition 3.1. Let $M$ be a Finsler $n$-manifold with nonempty boundary. A traversing geodesic of $M$ is a maximal geodesic $\gamma : [0, \ell(\gamma)] \to M$ which does not intersect $\partial M$, except at its endpoints where it meets the boundary transversely. The Finsler manifold $M$ has almost no trapped geodesics if for almost every unit tangent vector $v \in UM$, the maximal geodesic $\gamma_v$ defined by $\gamma_v'(0) = v$ reaches the boundary of $M$ in the future and in the past, that is, $\gamma_v(t) \in \partial M$ for some $t \geq 0$ and some $t \leq 0$.
For instance, a compact Finsler manifold with minimizing interior geodesics has almost no trapped geodesics. Another example is obtained by taking a closed Finsler manifold with ergodic geodesic flow and removing a smoothly bounded nonempty open set.
As we will explain below, the space Γ of traversing geodesics of M is a (2n − 2)-dimensional manifold admitting a natural symplectic structure, whose corresponding natural volume measure is denoted by µ Γ ; see Definition 3.6.
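Theorem 3.2 (Blaschke's formula). Let $M$ be a Finsler $n$-manifold with almost no trapped geodesics. Then the Holmes-Thompson volume of an immersed hypersurface $N \subseteq M$ is equal to
$$\operatorname{vol}_{n-1}(N) = \frac{1}{2\, \epsilon_{n-1}} \int_\Gamma \#(\gamma \cap N)\, d\mu_\Gamma(\gamma) \tag{3.1}$$
where $\#(\gamma \cap N)$ is the number of times that $\gamma$ intersects $N$ and $\epsilon_{n-1}$ is the volume of the Euclidean unit ball in $\mathbb{R}^{n-1}$. Similarly, the Holmes-Thompson volume of a co-oriented immersed hypersurface $N \subseteq M$ is equal to
$$\operatorname{vol}_{n-1}(N) = \frac{1}{\epsilon_{n-1}} \int_\Gamma \#(\gamma \cap^+ N)\, d\mu_\Gamma(\gamma) \tag{3.2}$$
where $\#(\gamma \cap^+ N)$ is the number of times that $\gamma$ intersects $N$ transversely in the positive direction.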
In equation (3.1), we can restrict the integral to geodesics γ ∈ Γ which are transverse to the hypersurface N since the geodesics γ ∈ Γ which are tangent to N form a subset of zero measure; see Proposition 3.7.(3).
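Remark 3.3. Since every traversing geodesic intersects $\partial M$ positively exactly once, we derive from (3.2) that the total measure of the space $\Gamma$ is
$$\mu_\Gamma(\Gamma) = \epsilon_{n-1} \operatorname{vol}_{n-1}(\partial M),$$
where $\epsilon_{n-1}$ is the volume of the Euclidean unit ball in $\mathbb{R}^{n-1}$.
In the case of Finsler surfaces with self-reverse metric, the Blaschke and Santaló formulas specialize as follows.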
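Corollary 3.5. Let $M$ be a self-reverse Finsler metric surface with almost no trapped geodesics. Then the length of any immersed curve $c$ in $M$ is
$$\operatorname{length}(c) = \frac{1}{4} \int_\Gamma \#(\gamma \cap c)\, d\mu_\Gamma(\gamma) \tag{3.4}$$
and the Holmes-Thompson area of any smoothly-bounded domain $D \subseteq M$ is
$$\operatorname{area}(D) = \frac{1}{2\pi} \int_\Gamma \operatorname{length}(\gamma \cap D)\, d\mu_\Gamma(\gamma) \tag{3.5}$$
$$\phantom{\operatorname{area}(D)} = \frac{1}{8\pi} \int_\Gamma \int_\Gamma \#(\gamma \cap \gamma' \cap D)\, d\mu_\Gamma(\gamma)\, d\mu_\Gamma(\gamma'). \tag{3.6}$$
The equation (3.6), obtained from (3.5) and (3.4), will be called the Santaló+Blaschke formula. In deducing this formula, we use the hypothesis that the metric is self-reverse when we equate the length of a geodesic with its Holmes-Thompson measure. In general, the Holmes-Thompson measure of a curve is the average of its forward and backward lengths.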
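These normalizations can be sanity-checked on the Euclidean unit disk $D$. The sketch below is an illustration under stated assumptions: it takes the constants $\frac{1}{4}$ and $\frac{1}{2\pi}$ in the length and area formulas (values consistent with the total measure $\mu_\Gamma(\Gamma)$ being twice the boundary length), and it parametrizes oriented chords by a signed distance $p \in (-1, 1)$ to the center and a direction $\theta \in [0, 2\pi)$ with $d\mu_\Gamma = dp\, d\theta$. These coordinates are introduced only for this check.

```latex
% Requires amsmath. A chord at signed distance p from the center has length 2 \sqrt{1 - p^2}.
% For a diameter c, the chords of direction \theta (measured from c) that meet c
% fill an interval of p-values of width 2 |\sin\theta|.
\begin{align*}
  \mu_\Gamma(\Gamma)
    &= \int_0^{2\pi}\!\! \int_{-1}^{1} dp\, d\theta = 4\pi = 2 \operatorname{length}(\partial D), \\
  \frac{1}{2\pi} \int_\Gamma \operatorname{length}(\gamma)\, d\mu_\Gamma(\gamma)
    &= \frac{1}{2\pi} \int_0^{2\pi}\!\! \int_{-1}^{1} 2 \sqrt{1 - p^2}\, dp\, d\theta
     = \frac{1}{2\pi} \cdot 2\pi \cdot \pi = \pi = \operatorname{area}(D), \\
  \frac{1}{4} \int_\Gamma \#(\gamma \cap c)\, d\mu_\Gamma(\gamma)
    &= \frac{1}{4} \int_0^{2\pi} 2\, |\sin\theta|\, d\theta
     = \frac{1}{4} \cdot 8 = 2 = \operatorname{length}(c).
\end{align*}
```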
The rest of this section is dedicated to describing the symplectic structure on Γ and proving Theorems 3.2 and 3.4.
3.1. Symplectic manifold of traversing geodesics. Let $M$ be a Finsler $n$-manifold. Recall that $\Gamma$ is the space of traversing geodesics of $M$. This space $\Gamma$ is a $(2n-2)$-dimensional manifold parameterized by the initial vectors $\gamma'(0) \in UM|_{\partial M}$ of the geodesics $\gamma : [0, \ell(\gamma)] \to M$ of $\Gamma$. Note that the length $\ell(\gamma)$ depends smoothly on $\gamma \in \Gamma$. Consider the surjective submersion $\pi_\Gamma : U^*_\Gamma M \to \Gamma$ taking any momentum $\xi \in U^*_\Gamma M$ to the geodesic $\gamma \in \Gamma$ that it generates. The fibers of $\pi_\Gamma$ are the $Z$-orbits corresponding to the traversing geodesics.
There exists a unique 2-form $\omega_\Gamma$ on $\Gamma$ such that
$$\pi_\Gamma^* \omega_\Gamma = \omega_M|_{U^*_\Gamma M}.$$
This follows from the invariance of the 2-form $\omega_M|_{U^*_\Gamma M}$ under the cogeodesic flow, and the fact that this form vanishes in the direction of $Z$ according to Definition 2.6. (See also [Cos20] for details, or [AM78, §4.3] for a general account on symplectic reduction.) The form $\omega_\Gamma$ is symplectic, thus it determines on $\Gamma$ a smooth volume measure $\mu_\Gamma$ given by
$$\mu_\Gamma = \frac{1}{(n-1)!}\, \bigl| \omega_\Gamma^{n-1} \bigr|. \tag{3.8}$$
3.2. Non-traversing geodesics are negligible.
We will need the following result in order to establish our versions of Blaschke's and Santaló's formulas. This feature is not required in the previous versions and requires the manifold to have almost no trapped geodesics.
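Recall that a subset $A$ of a manifold $X$ is negligible in $X$ if the image of $A$ in any local chart of $X$ has zero measure.
3.3. Manifold of positive momentums across a hypersurface.
We will need the following notion in the proof of Blaschke's formula. Let $C^*N \subseteq U^*M$ denote the space of unit momentums of $M$ based at a co-oriented hypersurface $N$ that point in the positive direction with respect to the co-orientation of $N$. Consider the restriction map
$$\rho_N : C^*N \to \operatorname{Int}(B^*N), \qquad \xi \mapsto \xi|_{T_xN},$$
to the interior $\operatorname{Int}(B^*N)$ of the unit co-ball bundle $B^*N$ of $N$.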
The following statement can be found in [ÁB06,Lemma 5.4]. We simply provide the details of the proof.
Lemma 3.9. The space $C^*N$ is a symplectic submanifold of $T^*M$ and the restriction map $\rho_N : C^*N \to \operatorname{Int}(B^*N)$ is a symplectomorphism.
Proof. Let $\xi \in C^*N$ with basepoint $x \in N$. By definition, the norm of $\xi$ is 1, so the norm of its restriction $\xi'$ to $T_xN$ is at most 1. Furthermore, by strong convexity of $F^*_x$, the linear form $\xi$ attains its maximum on the unit ball only at its Legendre-dual unit vector, which is positive and thus not contained in $T_xN$. Therefore, $\|\xi'\| < 1$ and the restriction map $\rho_N$ takes values in $\operatorname{Int}(B^*N)$.
To see that $\rho_N$ is a diffeomorphism, we employ local coordinates $(x_i)_{1 \leq i \leq n}$ in $M$ so that the hypersurface $N$ is given by the equation $x_n = 0$. Let $(x_i, v_i)_i$ and $(x_i, \xi_i)_i$ be the corresponding coordinates in $TM$ and $T^*M$. In terms of these coordinates, the operator $\rho_N$ acts by suppressing the last coefficient, that is, if $\xi = (\xi_i)_{1 \leq i \leq n}$, then $\xi' = (\xi_i)_{1 \leq i \leq n-1}$. Hence $\rho_N$ is smooth.
To prove that $\rho_N$ is bijective, consider a covector $\xi' = (\xi_i)_{1 \leq i \leq n-1} \in \operatorname{Int}(B^*_xN)$ and denote its norm $\lambda = \|\xi'\| < 1$. The covectors $\xi \in T^*_xM$ such that $\xi|_{T_xN} = \xi'$ are of the form $\xi_t = (\xi_1, \dots, \xi_{n-1}, t)$ with $t \in \mathbb{R}$. Consider the function $t \mapsto \|\xi_t\|$, where $\|\cdot\|$ is the norm $F^*_x$ on $T^*_xM$ that is dual to $F_x$. This function is bounded below by $\lambda$, and by the Hahn-Banach theorem, this lower bound is attained at some $t_0 \in \mathbb{R}$. Furthermore, since the norm $F^*_x$ is strongly convex, the set of values of $t$ such that $\|\xi_t\| \leq 1$ is a compact interval $[t_-, t_+]$ that contains $t_0$ in its interior, and $\|\xi_t\| = 1$ if and only if $t = t_\pm$. Thus we are left with two candidates $\xi_{t_\pm}$ that are the only unit covectors $\xi$ whose restriction to $T_xN$ is $\xi'$.
We claim that $\xi_{t_+}$ is positive (and $\xi_{t_-}$ is negative). That is, the vector that is in Legendre correspondence with $\xi_{t_+}$ (i.e., the unit vector where $\xi_{t_+}$ attains its norm) is positive. Indeed, when $t = t_0$, the covector $\xi_t$, as a function $B_xM \to \mathbb{R}$, is bounded above by $\lambda$. As $t$ increases towards $t_+$, the coefficient $\xi_n$ increases, and thus the values of $\xi_t(v)$ for $v$ on the negative side decrease (hence they are $< \lambda$). Thus, any functional $\xi_t$ with $t > t_0$, restricted to the ball $B_xM$, must attain its maximum value $\|\xi_t\|$ (which is $> \lambda$) on a positive vector, as required. This shows that $\xi_t$ is positive if $t > t_0$ (and, similarly, $\xi_t$ is negative if $t < t_0$). We conclude that $\xi_{t_+}$ is the only positive unit covector $\xi$ whose restriction to $T_xN$ is $\xi'$. This proves that $\rho_N$ is bijective. Additionally, $t_+$ depends smoothly on $\xi'$ by the implicit function theorem. This finishes the proof that the restriction map $\rho_N$ is a diffeomorphism.
Let us show that $\rho_N^* \alpha_N = \alpha_M|_{C^*N}$. In canonical coordinates, the tautological 1-form $\alpha_M$ on $T^*M$ is written as $\alpha_M = \sum_{i=1}^n \xi_i\, dx_i$. In restricting to $C^*N$, the last term vanishes because $x_n = 0$ on $N$, thus the restricted form can be written as $\alpha_M|_{C^*N} = \sum_{i=1}^{n-1} \xi_i\, dx_i$. On the other hand, the tautological 1-form of $N$ is $\alpha_N = \sum_{i=1}^{n-1} \xi_i\, dx_i$, and this expression is unchanged by the pullback $\rho_N^*$ since the map $\rho_N : C^*N \to \operatorname{Int}(B^*N)$ acts simply by suppressing the coordinate $\xi_n$. We conclude that $\rho_N^* \alpha_N = \alpha_M|_{C^*N}$. Taking the exterior differential of this expression, we obtain $\rho_N^* \omega_N = \omega_M|_{C^*N}$. This implies that $C^*N$ is a symplectic submanifold of $T^*M$.
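For orientation, the construction in this proof can be made explicit in the Euclidean case. This is an illustrative special case, not part of the original argument: it assumes $M = \mathbb{R}^n$ with the Euclidean norm (so that the dual norm is Euclidean as well) and $N = \{x_n = 0\}$ co-oriented by $\partial / \partial x_n$.

```latex
% Requires amsmath. Euclidean instance of the values t_0, t_- and t_+ from the proof.
\begin{align*}
  \|\xi_t\|^2 &= \|\xi'\|^2 + t^2 = \lambda^2 + t^2,
    \qquad t_0 = 0, \qquad t_\pm = \pm \sqrt{1 - \lambda^2}, \\
  \rho_N^{-1}(\xi') &= \bigl( \xi_1, \dots, \xi_{n-1}, \sqrt{1 - \lambda^2}\, \bigr),
    \qquad \lambda = \|\xi'\| < 1 .
\end{align*}
```

The square root is smooth on the open co-ball $\lambda < 1$, in agreement with the smooth dependence of $t_+$ obtained from the implicit function theorem.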
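3.4. Coarea formula and fiber integration. In the proofs of Blaschke's and Santaló's formulas, we will need the following version of the coarea formula; see [Die72,(16.24
Lemma 3.10. Let $\pi : X \to Y$ be a submersion between two oriented manifolds of dimension $n$ and $m$ with $n \geq m$. Let $\alpha$ and $\beta$ be two differential forms on $X$ and $Y$ of degree $n - m$ and $m$. Then
$$\int_X \alpha \wedge \pi^* \beta = \int_Y \Bigl( \int_{\pi^{-1}(y)} \alpha \Bigr)\, \beta(y)$$
where $\pi^{-1}(y)$ is endowed with the orientation induced by $\pi$ from the orientations of $X$ and $Y$.
In particular, for $n = m$ and $\alpha = 1$, we have
$$\int_X \pi^* \beta = \int_Y \# \pi^{-1}(y)\, \beta(y),$$
where the points of the fiber $\pi^{-1}(y)$ are counted with the sign given by the induced orientation.
3.5. Proof of the Blaschke formula. We can now proceed to the proof of Blaschke's formula (3.1).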
Proof of Theorem 3.2. We will follow the proof given in [ÁB06,Theorem 5.2] under the extra assumption that the space of oriented geodesics on M is a manifold. The Blaschke formula (3.1) for a non-cooriented hypersurface N can be deduced from the co-oriented version (3.2) by taking the co-oriented double cover of N . Therefore it is sufficient to prove the latter formula. Furthermore, every immersed hypersurface can be decomposed into a disjoint union of embedded hypersurfaces up to a negligible set. Therefore it is sufficient to prove (3.2) for a co-oriented embedded hypersurface N .
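By definition of the Holmes-Thompson volume, see (2.3), we have
$$\operatorname{vol}_{n-1}(N) = \frac{1}{\epsilon_{n-1}} \int_{B^*N} \frac{1}{(n-1)!}\, \omega_N^{n-1} = \frac{1}{\epsilon_{n-1}} \int_{C^*N} \frac{1}{(n-1)!}\, \omega_M^{n-1}$$
where $\epsilon_{n-1}$ is the volume of the Euclidean unit ball in $\mathbb{R}^{n-1}$ and the second equality follows from Lemma 3.9. Now, apply Proposition 3.7.(2) with $H = C^*N \subseteq U^*M$. It follows that $C^*N \cap U^*_\Gamma M$ has full measure in $C^*N$. Thus,
$$\operatorname{vol}_{n-1}(N) = \frac{1}{\epsilon_{n-1}} \int_{C^*N \cap U^*_\Gamma M} \frac{1}{(n-1)!}\, \omega_M^{n-1}.$$
Consider the map $\pi : C^*N \cap U^*_\Gamma M \to \Gamma$ taking a unit momentum of $M$ based at $N$ pointing in a positive direction (with respect to the co-orientation of $N$) to the traversing geodesic it generates. Apply the fiber integration formula (3.9) to this map with $\beta = \omega_\Gamma^{n-1}$. This yields the relation
$$\int_{C^*N \cap U^*_\Gamma M} \omega_M^{n-1} = \int_\Gamma \#(\gamma \cap^+ N)\, \omega_\Gamma^{n-1}$$
where $\#(\gamma \cap^+ N)$ is the number of times that $\gamma$ crosses $N$ transversely in the positive sense (as determined by the co-orientation of $N$). Taking into account the definition of $\mu_\Gamma$ by equation (3.8), Blaschke's formula follows.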
3.6. Proof of the Santaló formula. We will need the following proposition expressing the Holmes-Thompson volume of a manifold as an integral over the bundle of dual unit spheres (instead of dual unit balls).
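Proposition 3.11. The Holmes-Thompson volume of a Finsler $n$-manifold $M$ is equal to
$$\operatorname{vol}_n(M) = \frac{1}{n!\, \epsilon_n} \int_{U^*M} \alpha_M \wedge \omega_M^{n-1},$$
where $\epsilon_n$ is the volume of the Euclidean unit ball in $\mathbb{R}^n$.
Proof. We may assume that $M$ is a compact manifold with corners. (If $M$ is not compact, we can triangulate it and apply the proposition on each $n$-simplex to infer that it holds on the whole manifold.) By Stokes' theorem,
$$\int_{B^*M} \omega_M^n = \int_{U^*M} \alpha_M \wedge \omega_M^{n-1} + \int_{B^*M|_{\partial M}} \alpha_M \wedge \omega_M^{n-1}$$
where $\partial M$ is considered as a piecewise-smooth $(n-1)$-manifold (and we may restrict the integral to its smooth part).
To finish the proof, we shall show that the $(2n-1)$-form $\alpha_M \wedge \omega_M^{n-1}$ vanishes on $B^*M|_{\partial M}$. Note that the one-form $\alpha_M$ vanishes on the vertical space $T^*_xM$ and the two-form $\omega_M$ vanishes at bi-vectors formed of two horizontal or two vertical vectors. This follows from the coordinate expression (2.2). The $(2n-1)$-form $\alpha_M \wedge \omega_M^{n-1}$ evaluated at $(u_1, \dots, u_{2n-1})$, where $n - 1$ vectors $u_i$ are horizontal and $n$ vectors $u_i$ are vertical, can be written as a sum of terms of the form
$$\pm\, \alpha_M(u_{\sigma(1)})\, \omega_M(u_{\sigma(2)}, u_{\sigma(3)}) \cdots \omega_M(u_{\sigma(2n-2)}, u_{\sigma(2n-1)}) \tag{3.10}$$
where $\sigma$ is a permutation. If $u_{\sigma(1)}$ is vertical then the factor $\alpha_M(u_{\sigma(1)})$ is equal to zero. If $u_{\sigma(1)}$ is horizontal then there are only $n - 2$ horizontal vectors (and $n$ vertical ones) among the remaining vectors, which implies that one of the factors $\omega_M(u_{\sigma(2k)}, u_{\sigma(2k+1)})$ has two vertical vectors and therefore vanishes. In both cases, the term (3.10) vanishes.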
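The counting behind this last step can be summarized in one line. This restatement is added for clarity; the symbol $h$, denoting the number of horizontal vectors that a nonvanishing term would require, is introduced only here.

```latex
% Requires amsmath. A nonvanishing term needs u_{\sigma(1)} horizontal, and each of the n - 1
% factors \omega_M(u_{\sigma(2k)}, u_{\sigma(2k+1)}) must pair one horizontal vector
% with one vertical vector.
\[
  h \;=\; \underbrace{1}_{\text{for } \alpha_M(u_{\sigma(1)})}
    \;+\; \underbrace{(n-1)}_{\text{one per factor } \omega_M}
  \;=\; n \;>\; n - 1 \;=\; \#\{\text{horizontal vectors available}\}.
\]
```

Since only $n - 1$ horizontal vectors are available, no term can be nonzero.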
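Proof of Theorem 3.4. Recall that $\omega_M = \pi_\Gamma^* \omega_\Gamma$ on $U^*_\Gamma M$, see Definition 3.6, and that $U^*_\Gamma M$ has full measure in $U^*M$, see Proposition 3.7.(1). By Proposition 3.11 we have
$$\operatorname{vol}_n(D) = \frac{1}{n!\, \epsilon_n} \int_{U^*D \cap U^*_\Gamma M} \alpha_M \wedge \pi_\Gamma^* \omega_\Gamma^{n-1},$$
where $\epsilon_n$ is the volume of the Euclidean unit ball in $\mathbb{R}^n$. By Lemma 3.10, integrating along the fibers of the submersion $\pi_\Gamma : U^*_\Gamma M \to \Gamma$, which are the Hamiltonian lifts of the traversing geodesics and on which $\alpha_M$ restricts to the arclength element, yields
$$\int_{U^*D \cap U^*_\Gamma M} \alpha_M \wedge \pi_\Gamma^* \omega_\Gamma^{n-1} = \int_\Gamma \operatorname{length}(\gamma \cap D)\, \omega_\Gamma^{n-1}.$$
Hence,
$$\operatorname{vol}_n(D) = \frac{1}{n\, \epsilon_n} \int_\Gamma \operatorname{length}(\gamma \cap D)\, d\mu_\Gamma(\gamma).$$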

Discretization of Finsler surfaces
The goal of this section is to describe a discretization of Finsler disks with minimizing interior geodesics into simple discrete metric disks. For this, we adapt the general approach of discretization developed in [Cos18] in relation with the filling area conjecture. The main novelty is that, in our case, the discrete geometry is described by a system of curves (wall system) made of geodesics.
First, we need to fix some notation regarding intersections of maps.
Definition 4.1. The intersections of a map $f : X \to Y$ with a map $f' : X' \to Y$ lying in a subset $A \subseteq Y$ are the ordered pairs in the set
$$I_A(f, f') = \{ (x, x') \in X \times X' : f(x) = f'(x') \in A \}.$$
The number of intersections between $f$ and $f'$ is defined as
$$\#(f \cap f') = \# I_Y(f, f')$$
where $\#S$ denotes the cardinality of a set $S$.
Similarly, the self-intersections of a map $f : X \to Y$ lying in a subset $A \subseteq Y$ are the unordered pairs in the set
$$I_A(f) = \{ \{x, x'\} \subseteq X : x \neq x' \text{ and } f(x) = f(x') \in A \}$$
and the multiplicity of a point $y \in Y$ as a self-intersection of $f$ is the number $\# I_{\{y\}}(f)$. A self-intersection is simple if it has multiplicity 1.
Let us introduce the notion of wall system on a disk; see [Cos18].
Definition 4.2. A (smooth) wall system on a surface M is a 1-dimensional (smooth) immersed submanifold W satisfying the following conditions: (1) the immersion map is proper (that is, the preimage of any compact subset of M is compact); (2) W is transverse to the boundary ∂M and satisfies ∂W = W ∩ ∂M ; (3) W is self-transverse and has only simple self-intersections; (4) no self-intersections of W lie on the boundary ∂M .
As a technical remark, we note that the symbol W denotes the immersion map, not its image Im(W) ⊆ M , nor its domain. The domain is a 1-manifold, i.e., a disjoint union of countably many intervals and circles. Hence the expression ∂W ⊆ ∂M involves an abuse of notation and actually means Im(∂W) ⊆ ∂M , where ∂W is the restriction of the map W to the boundary of the domain of W. The image of W will also be denoted W. Thus, the expression M \ W denotes M \ Im(W).
Eventually we will need to relax the definition by dropping condition (4). In this case, we say that W is a quasi wall system on M .
The curves that form a (quasi) wall system are called its walls. Note that if the surface $M$ is compact, then $W$ consists of finitely many compact walls; each of these walls is either a loop that avoids the boundary or an arc that meets the boundary only at its two endpoints. A quasi wall system $W$ on a disk $D$ is simple if its walls are arcs that have no self-intersections and that meet each other at most once. In this paper, every quasi wall system $W$ is smooth unless we make it clear that it is piecewise smooth. In that case, the non-smooth points of $W$ may not coincide with the self-intersection points of $W$. Note that a piecewise smooth quasi wall system can be turned into a smooth quasi wall system by an isotopic deformation.
Example 4.3. Let $D$ be the unit disk in the Euclidean plane. A wall system made of the horizontal and vertical diameters of $D$ has area 1. A quasi wall system made of the three sides of an inscribed triangle of $D$ has area $\frac{3}{2}$.
We will also need the following definitions regarding the geometry induced by a quasi wall system.
Definition 4.4. Every quasi wall system $W$ on a compact surface $M$ determines a discrete length for curves $c$ in $M$, given by
$$\operatorname{length}_W(c) = \#(c \cap W). \tag{4.1}$$
That is, the length of a curve is the number of times it intersects the quasi wall system (counted with multiplicity). Every quasi wall system $W$ also induces a pseudo-distance on $M \setminus W$ defined by
$$d_W(x, y) = \inf_c\, \operatorname{length}_W(c)$$
where the infimum is taken over all paths $c$ of $M$ joining $x$ to $y$. We will refer to the pseudo-distance $d_W$ as the discrete distance induced by $W$ on $M$.
The discrete area of (M, W) is the number of self-crossings of W contained in the interior of M plus half the number of self-crossings on the boundary. That is, When the quasi wall system is simple, the curves of W have no self-intersections and the second sum vanishes.
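As an illustration, the counts for the two configurations of Example 4.3 can be written out using the convention just stated: self-crossings in the interior count with weight 1 and self-crossings on the boundary with weight $\frac{1}{2}$. The tabulation below is added for clarity.

```latex
% Requires amsmath. Discrete area = (interior crossings) + (1/2) (boundary crossings).
\begin{align*}
  \text{two perpendicular diameters:}\quad
    & 1 \text{ interior crossing},\ 0 \text{ boundary crossings}
    && \Longrightarrow\ \text{area} = 1 + \tfrac{1}{2} \cdot 0 = 1, \\
  \text{inscribed triangle:}\quad
    & 0 \text{ interior crossings},\ 3 \text{ boundary crossings (the vertices)}
    && \Longrightarrow\ \text{area} = 0 + \tfrac{1}{2} \cdot 3 = \tfrac{3}{2}.
\end{align*}
```

The inscribed triangle is only a quasi wall system precisely because its crossings lie on the boundary.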
We will need the following result describing the intersection of two distance-realizing arcs of $M$. Recall that $\Gamma$ is the space of traversing geodesics of $M$ (i.e., geodesic arcs of $M$ which do not intersect $\partial M$ except at their endpoints, where they meet the boundary transversely).
Proof. By Theorem 12.1, the distance-realizing arc $[x, y]$ is $C^1$. Suppose that the arcs $\gamma$ and $[x, y]$ are tangent, either at an interior point of $M$ or at an endpoint of $\gamma$ in $\partial M$. In both cases, this implies that $[x, y]$ contains $\gamma$ since the distance-realizing arc $[x, y]$ follows the geodesic flow in the interior of $M$ and the endpoints $x, y$ do not lie in $\gamma$. Now, since the interior geodesic $\gamma$ is transverse to $\partial M$ at its endpoints $\bar{x}$ and $\bar{y}$, the distance-realizing arc $[x, y]$ is not differentiable at $\bar{x}$ and $\bar{y}$. In particular, it is not $C^1$, which is absurd. Therefore, the arcs $\gamma$ and $[x, y]$ may only have transverse intersections.
Suppose that the arcs γ and [x, y] intersect at least twice, say at a and b (with a and b different from x and y). Since both arcs are distance-realizing curves, the subarcs [a, b] ⊆ [x, y] and γ ab ⊆ γ joining a and b have the same length. Construct an arc α joining x and y by replacing the subarc [a, b] of [x, y] with the arc γ ab of the same length. By construction, the arc α is a distance-realizing curve. But since the intersection between γ and [x, y] is transverse, the arc α is not differentiable at a and b. In particular, it is not C 1 , which is absurd. Therefore, the arcs γ and [x, y] intersect at most once, and so exactly once if γ separates x and y.
Suppose now that $\gamma$ does not separate $x$ and $y$. Then the arc $[x, y]$ does not intersect $\gamma$. Otherwise, it would go from one side of $\gamma$ to the other (recall that $\gamma$ and $[x, y]$ have transverse intersection) and, because $x$ and $y$ are on the same side of $\gamma$, it would have to cross $\gamma$ a second time, which is excluded. Therefore, the arcs $\gamma$ and $[x, y]$ do not intersect if $\gamma$ does not separate $x$ and $y$.
Let us compare the shortest paths for Finsler metrics and discrete metrics.
Definition 4.6. A quasi wall system is geodesic if its walls are geodesics.
Proposition 4.7. Let $M$ be a self-reverse Finsler metric disk with minimizing interior geodesics, and let $W$ be a geodesic quasi wall system on $M$. Then, every distance-realizing arc $[x, y]$ of $M$ with endpoints $x, y$ not lying in $W$ is also length minimizing with respect to $W$. Thus, for every $x, y \in M \setminus W$, we have
$$d_W(x, y) = \operatorname{length}_W([x, y]).$$
Proof. The quasi wall system $W$ is made of finitely many geodesics $\gamma_i$ that are transverse to $\partial M$. By Lemma 4.5, the arc $[x, y]$ crosses only those geodesics $\gamma_i$ that separate $x$ from $y$, exactly once. Since every curve from $x$ to $y$ must cross each of these separating geodesics at least once, no curve from $x$ to $y$ can be shorter than $[x, y]$ with respect to $W$.
Before proceeding we derive a useful consequence of the last lemma.
Lemma 4.8. Let M be a self-reverse Finsler metric disk with minimizing interior geodesics. Then d(x, y) ≤ $\frac{1}{2}$ length(∂M) for any pair of points x, y ∈ M. The same inequality holds if the distance and length are taken with respect to a geodesic quasi wall system W, that is, d_W(x, y) ≤ $\frac{1}{2}$ length_W(∂M).

Proof. Join the points x, y ∈ M by a distance-realizing arc [x, y]. By Lemma 4.5, each traversing geodesic γ of M intersects [x, y] at most once and meets ∂M exactly twice. Then the inequality d(x, y) ≤ $\frac{1}{2}$ length(∂M) follows from Blaschke's formula (3.4) applied to [x, y].
The claim regarding the geodesic quasi wall system W is proved in a similar way. By Proposition 4.7, the distance-realizing arc [x, y] is also length-minimizing with respect to W. Since each wall of W crosses [x, y] at most once and meets ∂M exactly twice, we derive the desired second inequality from the definition of length W ; see (4.1).
Simple wall systems can be used to discretize Finsler disks M with minimizing interior geodesics.
For every a, b ∈ R and every ε > 0, we write a ≃ b ± ε if |a − b| < ε.
Theorem 4.9. Let (M, F ) be a self-reverse Finsler metric disk with minimizing interior geodesics. Then, for every ε > 0 and every integer n large enough, there exists a wall system W, made of n geodesics of M , such that for every x, y ∈ M \ W, we have where L = length F (∂M ). Furthermore, the wall system W is necessarily simple.
Note that [Cos18, Theorem 7.1] states the existence of a wall system with similar approximation properties, but not necessarily made of geodesics.
Proof. The wall system W will be made of random geodesics. Recall that Γ is the space of traversing geodesics of M (i.e., geodesic arcs of M which do not intersect ∂M except at their endpoints, where they meet the boundary transversely) and has a natural measure μ_Γ; see (3.8). Furthermore, this space has finite total measure μ_Γ(Γ) = 2L; see 3.3. Thus we may define on Γ the probability measure P = $\frac{\mu_\Gamma}{2L}$. Take n independent identically distributed (i.i.d.) random geodesics γ_1, . . . , γ_n of Γ with probability distribution P. Almost surely, these geodesics form a wall system W of M (see Definition 4.2), because they are pairwise different and form only simple crossings located in the interior of M. Moreover, this wall system is simple, since the geodesics are minimizing and therefore cannot cross each other more than once by Lemma 4.5. At this point, Theorem 4.9 follows from the next two lemmas.
The first lemma is obtained by applying the weak law of large numbers to the Blaschke formula (3.4) in a uniform way.
Lemma 4.10. With probability converging to 1 as n → ∞, the estimate Proof. Let D be a finite covering of M by smoothly bounded disks D with perimeter length F (∂D) < ε. Fix a basepoint p in each disk D ∈ D and denote by P the collection of all basepoints. Almost surely, the geodesics of W avoid the points of P and are transverse to the boundaries of the disks D ∈ D.
The following claim shows that the conclusion of the lemma holds in some finite cases.
Claim 4.11. The following assertions hold with probability converging to 1 as n → ∞.
(1) For every pair of points p, q ∈ P , we have (4.6) (2) For every disk D ∈ D and every pair of points x, y ∈ D \ W, we have Proof.
(1) Recall that the distance-realizing arc [p, q] is C 1 embedded in M ; see Theorem 12.1.
The intersection function f = f p,q : Γ → N defined by is a nonnegative measurable function. By Blaschke's formula (3.4), the random variables with probability converging to 1 as n → ∞. By Proposition 4.7, we have hence (1) follows.
(2) The proof of the second assertion is similar. For a disk D ∈ D, the intersection function f (γ) = #(γ ∩ ∂D) has expected value 2 L length F (∂D) by Blaschke's formula (3.4). Applying the weak law of large numbers to the random variables X i = f (γ i ) as previously, we derive with probability converging to 1 as n → ∞. Thus, Since D is a disk with minimizing interior geodesics, the discrete part of Lemma 4.8 yields (2).
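The law-of-large-numbers step in assertion (1) can be illustrated numerically in the simplest setting. The following is a minimal sketch, assuming that M is the Euclidean unit disk (so that L = 2π and the traversing geodesics are chords), that random chords are sampled by a uniform offset and normal direction, and that the wall distance is counted as the number of chords separating the two points. The function names are hypothetical and not taken from the text. The sketch compares the fraction of random chords separating x from y with the value $\frac{2}{L}$ d(x, y) predicted by Blaschke's formula.

```python
import math
import random

def random_line(rng):
    """Sample a line meeting the unit disk (assumed model, not from the text).

    The line is {z : <z, u(theta)> = p} with offset p uniform in [0, 1) and
    normal direction theta uniform in [0, 2*pi): the rigid-motion invariant
    measure on lines, normalized to a probability measure on chords.
    """
    return rng.random(), rng.uniform(0.0, 2.0 * math.pi)

def separates(line, x, y):
    """True if x and y lie strictly on opposite sides of the line."""
    p, theta = line
    ux, uy = math.cos(theta), math.sin(theta)
    side_x = x[0] * ux + x[1] * uy - p
    side_y = y[0] * ux + y[1] * uy - p
    return side_x * side_y < 0

def wall_distance_ratio(x, y, n, seed=0):
    """Fraction of n random chords that separate x from y."""
    rng = random.Random(seed)
    walls = [random_line(rng) for _ in range(n)]
    return sum(separates(w, x, y) for w in walls) / n

x, y = (-0.5, 0.0), (0.5, 0.0)
L = 2.0 * math.pi                      # perimeter of the unit disk
estimate = wall_distance_ratio(x, y, n=20000)
expected = 2.0 / L * math.dist(x, y)   # (2/L) d(x, y)
print(estimate, expected)
```

In this model the two numbers should agree up to a statistical error of order n^(−1/2); the sketch says nothing about the uniformity in x and y, which is the actual content of the lemma.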
Without loss of generality, we can assume that the conclusion of the previous claim is satisfied. Let x, y ∈ M \ W. The points x and y lie in some disks D x and D y of D. Denote by p x and p y the basepoints of D x and D y .
Since D x is a disk with minimizing interior geodesics, by Lemma 4.8 we have thus by the triangle inequality, we obtain Combining the triangle inequality with (4.7), we obtain Thus, the following equalities hold up to additive constants which are universal multiples of ε (namely, 2 L + 1 ε for the first one, ε for the second and 2 L ε for the third one). Therefore, Hence the first lemma. The second lemma is obtained by applying a (slightly generalized) weak law of large numbers to the Santaló+Blaschke formula (3.6).
Lemma 4.12. With probability converging to 1 as n → ∞, we have $\frac{2}{n^2 - n}$ area(M, W) ≃ $\frac{2\pi}{L^2}$ area(M) ± ε.

Proof. The intersection counting function f : Γ × Γ → N defined by f(γ, γ′) = #(γ ∩ γ′) is a measurable function that takes value 0 or 1 almost surely. The $\frac{n(n-1)}{2}$ random variables X_{i,j} = f(γ_i, γ_j) with i < j are identically distributed but not completely independent. In fact X_{i,j} is independent of X_{k,l} if and only if {i, j} ∩ {k, l} = ∅. To apply the generalized weak law of large numbers, Theorem 4.13 below, we must check that the variables X_{i,j} are sufficiently independent. There are $\frac{n(n-1)}{2}$ ∼ n² variables X_{i,j}, which yield ∼ n⁴ pairs (X_{i,j}, X_{k,l}), of which only ∼ n³ are not independent. Therefore the proportion of nonindependent pairs p ∼ $\frac{n^3}{n^4}$ ∼ $\frac{1}{n}$ goes to zero as n → ∞. Thus, by Theorem 4.13, the average value of the variables X_{i,j} converges in probability to the expected value, which, by the Santaló+Blaschke formula (3.6), is equal to $\frac{2\pi}{L^2}$ area(M).

This concludes the proof of Theorem 4.9.
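The statement of Lemma 4.12 can also be checked numerically in a model case. The sketch below assumes that M is the Euclidean unit disk (area π, boundary length L = 2π), that random chords are sampled by a uniform offset and normal direction, and that the discrete area of a wall system of chords is its number of interior crossings. All names are hypothetical. Under these assumptions the normalized crossing count should be close to $\frac{2\pi}{L^2}$ area(M) = $\frac{1}{2}$.

```python
import math
import random
from itertools import combinations

TWO_PI = 2.0 * math.pi

def random_chord(rng):
    """Endpoint angles of a random chord of the unit disk (assumed model)."""
    p = rng.random()                    # distance from the line to the center
    theta = rng.uniform(0.0, TWO_PI)    # direction of the normal to the line
    half = math.acos(p)                 # half of the angle subtended by the chord
    return (theta - half) % TWO_PI, (theta + half) % TWO_PI

def in_arc(t, start, end):
    """True if the angle t lies on the counterclockwise arc from start to end."""
    return (t - start) % TWO_PI < (end - start) % TWO_PI

def cross(c1, c2):
    """Two chords cross iff exactly one endpoint of c2 lies on the arc cut off by c1."""
    return in_arc(c2[0], *c1) != in_arc(c2[1], *c1)

def crossing_fraction(n, seed=0):
    """Normalized crossing count 2 * (#crossings) / (n^2 - n) for n random chords."""
    rng = random.Random(seed)
    chords = [random_chord(rng) for _ in range(n)]
    crossings = sum(cross(a, b) for a, b in combinations(chords, 2))
    return 2.0 * crossings / (n * n - n)

fraction = crossing_fraction(600)
expected = 2.0 * math.pi * math.pi / TWO_PI ** 2   # 2*pi*area/L^2 with area = pi
print(fraction, expected)
```

The variables counted here are exactly the pairwise dependent X_{i,j} of the proof: two of them are correlated only when they share a chord, which is why the average still concentrates.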
Let us prove the following generalization of the weak law of large numbers.
Theorem 4.13 (Weak law of large numbers for identically distributed, mostly independent random variables). Fix a real valued random variable X with finite expected absolute value E(|X|) < ∞ and an integer n > 0. Then the average X̄ = $\frac{1}{n} \sum_i X_i$ of n random variables X_i, each with the same distribution as X, is near the expected value E(X) with probability arbitrarily close to 1 if the proportion of nonindependent pairs p = $\frac{1}{n^2}$ #{(i, j) | X_i and X_j are not independent} is small. More precisely, for every ε, δ > 0, there exists p₀ > 0 such that P(|X̄ − E(X)| > ε) < δ whenever p < p₀.

Remark 4.14. Note that we do not explicitly require n to be large, but this is generally necessary for p to be small, because each variable X_i is in general correlated with itself,³ which implies that p ≥ $\frac{n}{n^2}$ = $\frac{1}{n}$. If these are the only correlations and n goes to infinity, then p = $\frac{1}{n}$ → 0 and therefore X̄ converges to E(X) in probability. In this way, we recover the usual weak law of large numbers.
Proof. The proof is similar to the standard proof of the weak law of large numbers; see [Tao09, Theorem 1.5.1] for instance. It proceeds by cases; only the first one requires attention to the non-independent pairs.
Case E(X²) < ∞ and E(X) = 0. Fix ε > 0. We have to prove that the probability of deviation P(|X̄| > ε) gets arbitrarily low if p is sufficiently small. To apply Chebyshev's inequality, we compute Here we used the Cauchy-Schwarz inequality E(X_i X_j) ≤ E(X²) and the fact that E(X_i X_j) = E(X_i) E(X_j) = E(X)² = 0 if X_i and X_j are independent. Applying Chebyshev's inequality, we obtain as we had to prove.
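For the reader's convenience, the second-moment estimate behind this case can be written out as follows. This is a sketch reconstructed from the two facts quoted above (the Cauchy-Schwarz bound and the vanishing of E(X_i X_j) for independent pairs), with p the proportion of nonindependent pairs as in the statement of the theorem; the layout is ours.

```latex
\begin{align*}
E\bigl(\bar X^2\bigr)
  &= \frac{1}{n^2} \sum_{i,j} E(X_i X_j)
   = \frac{1}{n^2} \sum_{\substack{(i,j) \\ \text{not independent}}} E(X_i X_j)
   \le \frac{p\, n^2}{n^2}\, E(X^2)
   = p\, E(X^2), \\
P\bigl(|\bar X| > \varepsilon\bigr)
  &\le \frac{E\bigl(\bar X^2\bigr)}{\varepsilon^2}
   \le \frac{p\, E(X^2)}{\varepsilon^2}.
\end{align*}
```

The right-hand side is smaller than any prescribed δ > 0 as soon as p < δ ε² / E(X²).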
3 A random variable is independent of itself if and only if its probability distribution is concentrated in one value.
Case E(X 2 ) < ∞. This case follows from the previous one applied to the random variable Y = X − E(X), which satisfies E(Y 2 ) < ∞ and E(Y ) = 0.
General case E(|X|) < ∞. This case, which is not needed in this article, follows from a truncation argument as in the usual proof of the weak law of large numbers, given for instance in [Tao09].
We proceed to the details. It is sufficient to show that if p is small enough with respect to ε, δ and X. We may assume δ ≤ 1. We proceed as follows. For any cutoff value M ≥ 0, we decompose the random variable X as a sum of a bounded part and a tail where the bounded part is and the tail is In the same way we decompose the variables X i = X <M i + X ≥M A key fact about the decomposition (4.11) is that the expected absolute value E X ≥M of the tail part gets arbitrarily small if M is sufficiently large. This follows from the pointwise convergence |X ≥M | → 0 as M → +∞, which is dominated by |X|, or from the formula where P |X| is the probability distribution of |X| on R. We choose M so that (4.12) This implies that the average X ≥M of the tail parts also has small expected absolute value By Markov's inequality, this implies that X ≥M is small in absolute value with high probability (4.13) Now, the bounded part X <M has finite second moment E((X <M ) 2 ) < ∞. Therefore, we may apply the previous case of the theorem, which yields Here we used (4.12) and the assumption δ ≤ 1. Combining this with (4.14) and (4.13) by the triangle inequality, the result (4.10) follows.

Minimal area of disks: from discrete to Finsler metrics
The goal of this section is to state a discrete version of the area lower bound on Finsler disks with minimizing interior geodesics and to show how to derive the area lower bound for Finsler metrics from its discrete version.
Let us recall the area lower bound for Finsler metrics we want to prove. In order to state the discrete version of this result, we need to introduce the notion of simple discrete metric disks.
Definition 5.2. A topological disk D with a quasi wall system W is a simple discrete metric disk of radius r centered at an interior point O ∈ D\W if the quasi wall system W is simple (see Definition 4.2), all the points of D \ W are at d W -distance at most r from O and all the points of ∂D \ W are at distance exactly r from O.
It is essential here to allow W to be a quasi wall system rather than a wall system. Indeed, all points of W located on ∂D necessarily have multiplicity 2.
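The following result, which will be proved in the subsequent sections, can be seen as a discrete version of Theorem 5.1.

Theorem 5.3. The discrete area of every simple discrete metric disk of radius r is at least $\frac{3}{2} r^2$. Furthermore, the equality is attained.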
Assuming this discrete area lower bound, we can derive Theorem 5.1 as follows. Dividing by n², using (4.4) and (4.5), and letting ε go to zero, we obtain Hence, area(M) ≥ $\frac{6}{\pi} r^2$.

Sections 6-9 are devoted to the proof of Theorem 5.3.

Quasi wall systems and interval families
In this section, we show how to encode a simple discrete disk as a 1-dimensional object.
We start by proving the following basic fact about simple discrete metrics.
Proposition 6.1. Let D be a disk with a simple quasi wall system W. Then d W (x, y) = number of walls of W that separate x from y. (6.1) for any two points x, y ∈ D \ W.
Note that if D is a Finsler disk with minimizing interior geodesics and W is geodesic, then this proposition follows from Proposition 4.7.
Proof. It is clear that d W (x, y) ≥ number of walls of W that separate x from y.
To prove the reverse inequality we will show the following.
Claim 6.2. There exists a smooth path α from x to y that is in general position with respect to W ∪ ∂D and crosses each wall of W at most once.
Here, we say that a smooth curve α is in general position with respect to an immersed 1-submanifold N if it is regular, transverse to N and avoids the self-intersections of N . If α is piecewise smooth, we require in addition that none of its non-smooth points lie in N .
The claim is a version of Levi's extension (or enlargement) lemma for pseudoline arrangements. This version concerns arrangements on a disk, rather than on the projective plane as in the more standard version of the lemma (found e.g. in [FG17, Thm. 5.1.1]).
We prove the claim by induction on the number of walls. Suppose the claim is valid for any quasi wall system W made of n walls. Consider a simple quasi wall system W′ obtained by adding an extra wall w′ to W. By inductive hypothesis, there is a smooth path α that satisfies all the conditions of the claim with respect to W. By perturbing α, we ensure that it is transverse to w′ as well. If α crosses w′ at most once, then we are done. Otherwise, let x′ and y′ be the first and last points of α where α crosses w′. Note that they are generic points of w′: they are neither on W, nor on ∂D. Replace the segment of α from x′ to y′ by the segment [x′, y′] of w′, and let α′ be the resulting curve. We claim that α′ is a piecewise smooth curve, in general position with respect to W, that crosses each wall of W at most once. This is because the segment [x′, y′] that we inserted only crosses the walls of W that separate x′ from y′ (since it is part of a wall of the simple quasi wall system W′), and these walls are necessarily crossed as well by the piece of α between x′ and y′ that we replaced. The next step is to perturb the curve α′ so that the segment [x′, y′] is displaced sideways and away from w′ and the resulting curve α″ is in general position with respect to W ∪ ∂D and crosses W the same number of times as α′ does, and, in addition, is transverse to w′ and crosses w′ at most once. Thus, α″ is in general position with respect to W′ ∪ ∂D and crosses each wall of W′ at most once, but is non-smooth at two points. To make it smooth, we modify it near these two points.

One consequence of these formulas is that the discrete area of (D, W) given by (4.2) may be computed from I.
The following result characterizes the relation between the quasi wall system W and the interval family I. Before stating this result, we need to introduce a definition. A point p of S 1 is generic with respect to a finite interval family I of S 1 if p is not an endpoint of any interval of I. Alternatively, the endpoints of the intervals of I are the non-generic points of S 1 .
Proposition 6.5. Let (D, W ) be a simple discrete disk of radius r centered at O. The family I = I W of intervals of S 1 has the following properties: (1) no pair of intervals of I cover S 1 ; (2) every generic point of S 1 is contained in exactly r intervals of I; (3) every non-generic point of S 1 is an endpoint of exactly two, adjacent intervals of I. Moreover, if a finite family I of intervals of S 1 satisfies the conditions (1)-(3), then I = I W for some quasi wall system W that makes D a simple discrete metric disk of radius r and center O. For instance, one may let W be the unique standard quasi wall system homotopic to I on D \ {O}.

Proof.
(1) If two intervals α, β ∈ I cover S 1 , then the corresponding walls α, β of W would form a bigon containing the point O, which implies they cross twice, contradicting the hypothesis that W is simple.
(2) Consider a generic point p ∈ S 1 . Since W is a simple quasi wall system on D, the distance between any pair of points of D is the number of walls that separate them; see Proposition 6.1. On the other hand, the walls that separate O from p are the walls that cover p. Hence the result.
(3) This follows from the previous property: if p ∈ S 1 is the endpoint of some interval α ∈ I, it must also be the startpoint of some other interval so that every generic point near p is contained in the same number r of intervals of I. This means that p is the endpoint of two walls, and it cannot be the endpoint of more walls because W can only have simple self-intersections on ∂D since it is a quasi wall system; see Definition 4.2. Now, let I be a finite family of intervals of S 1 satisfying conditions (1)-(3), and let W be the unique standard quasi wall system homotopic to I on D \ {O}. Clearly, W is a quasi wall system, and it is simple because it is made of arcs that intersect each other at most once. Also, every point p ∈ D \ W is at distance at most r from O, and exactly r if p ∈ ∂D. (A shortest path is the vertical ray from p to O.) This shows that (D, W) is a simple discrete disk of radius r centered at O.

Inadmissible configurations in a minimal simple disk
In this section, we rule out some intersection patterns for an extremal quasi wall system on a disk.
Consider a quasi wall system W on D defining a simple discrete metric disk of radius r with minimal discrete area. By Proposition 6.5, we can assume that W is formed of standard arcs; see Definition 6.4.
Lemma 7.1. No arc of W covers two (possibly adjacent) intersecting arcs of W.
Proof. By contradiction, suppose that an arc γ of W covers two intersecting arcs α = ac and β = bd of W. Switching the roles of the two arcs if necessary, we may assume that the points a, b, c, d appear in that order in the interval γ (with possibly b = c). Let W′ be the collection of curves obtained from W by replacing α and β with the standard arcs α′ = ad and β′ = bc (with no β′ if b = c). See Figure 2. Note that, like W, the immersed 1-submanifold W′ is a quasi wall system on D. Moreover, we claim that W′ also makes D a simple discrete metric disk of radius r centered at O. This is because none of the properties (1)-(3) of Proposition 6.5 is affected by the replacement. For instance, there is no arc δ of W′ such that the intervals δ and α′ cover the boundary ∂D, because in that case δ and γ would also cover ∂D; however, the arcs δ and γ are already present in W, which would contradict, by Proposition 6.5, the fact that W is simple. Also, the fact that every generic point of ∂D is covered by exactly r arcs of the quasi wall system is clearly maintained, as well as the fact that each non-generic boundary point is the common endpoint of two adjacent walls.
Let us show that the area of (D, W′) is less than the area of (D, W) by comparing the number of self-intersections of the quasi wall systems W and W′ according to the discrete area formula (4.2). First, observe that every pair of arcs of W different from α and β belongs to W′. Therefore, these pairs of arcs give the same contribution to the discrete areas of W and W′. Let δ = pq be an arc of W different from α and β. By considering cases regarding the location of the endpoints p and q with respect to the points a, b, c and d, we see that δ intersects α′ ∪ β′ at most as many times as it intersects α ∪ β. Since α and β intersect each other, whereas α′ and β′ do not, the area of (D, W′) is less than the area of (D, W), contradicting the minimality of W.

Lemma 7.2. No arc of W intersects two adjacent arcs of W.

Proof. By contradiction, suppose that an arc γ of W intersects two adjacent arcs α and β of W. We choose γ so that it is minimal with respect to the covering relation, among arcs that intersect α and β (i.e., no arc of W covered by γ intersects α and β). Denote by a, b, c, d, e the endpoints of the three arcs, in the order in which they are found on the interval α ∪ β. Thus, α = ac, β = ce and γ = bd, and no arc of W that covers c is covered by γ (other than γ itself). Let c− and c+ be two points of ∂D close to c such that [c−, c+] ∩ ∂W = {c}. Let W′ be the collection of curves obtained from W by replacing the three arcs α, β and γ with the four arcs α′ = ac+, β′ = c−e, γ− = bc− and γ+ = c+d. See Figure 3.
Note that W′ is a quasi wall system on the disk D. In fact, W′ makes D a simple discrete disk of radius r centered at O. To see this we argue as in the proof of Lemma 7.1. By Proposition 6.5, it is enough to check that the family I′ = I_{W′} of boundary segments corresponding to the walls δ of W′ satisfies the properties (1)-(3) of Proposition 6.5. To check Property (2) (that each generic point of ∂D is covered r times by the walls of W′), note that both α ∪ β ∪ γ and α′ ∪ β′ ∪ γ− ∪ γ+ cover twice the generic points of [b, d] and once the remaining generic points of [a, e]. Property (3) regarding non-generic boundary points is also maintained, with the wall endpoint c replaced by the two points c− and c+. Finally, to check Property (1), suppose δ and ε are two arcs of W′ that cover the whole boundary ∂D. It is impossible that both δ and ε are among the new arcs α′, β′ and γ±, because that would mean that α and β already cover ∂D, contradicting the fact that W is simple. Similarly, the arcs δ and ε cannot both be among the unchanged arcs (those in W ∩ W′) either, otherwise W would not be simple. Therefore, δ is one of the unchanged arcs and ε is one of the new arcs α′, β′, γ±. In the case ε = α′, we see that δ and α′ cannot cover ∂D since this would imply that δ and α already cover ∂D. This is because α′ \ α is contained in the interval [c−, c+], which contains no endpoints of δ since [c−, c+] ∩ ∂W = {c}. The case ε = β′ is analogous, and the cases ε = γ± are easier to rule out since the arcs γ± are covered by γ. We conclude that Property (1) is satisfied, thus (D, W′) is a simple discrete metric disk of radius r.

Let us show that the area of (D, W′) is less than the area of (D, W). Again, we use the discrete area formula (4.2), which expresses area(D, W) as a sum running over the pairs {δ, ε} of different walls of W. The pairs {δ, ε} of walls that are contained in W ∩ W′ make the same contribution to area(D, W) and to area(D, W′).
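To evaluate the contribution of pairs {δ, ε} with δ ∈ W ∩ W′ and ε ∉ W ∩ W′, we note that any arc δ = pq with no endpoints in [c−, c+] satisfies #(δ ∩ (α ∪ β ∪ γ)) ≤ #(δ ∩ (α′ ∪ β′ ∪ γ− ∪ γ+)), with equality unless p ∈ [b, c] and q ∈ [c, d]. This is seen by considering case by case the possible locations of p and q with respect to a, b, c, d, e. The equality holds for all arcs δ = pq ∈ W ∩ W′, because the exceptional case p ∈ [b, c] and q ∈ [c, d] is excluded by how γ was chosen: the arc γ = bd covers no other arc δ = pq of W that in turn covers c. Finally, to compute the contribution of the pairs {δ, ε} where none of the two arcs δ and ε is in W ∩ W′, we note that the pairs formed by α, β and γ contribute $\frac{1}{2}$ + 1 + 1 = $\frac{5}{2}$ to area(D, W) (the adjacent arcs α and β meet on the boundary, and γ crosses each of them once), whereas the pairs formed by α′, β′, γ− and γ+ contribute 1 + $\frac{1}{2}$ + $\frac{1}{2}$ = 2 to area(D, W′) (the arcs α′ and β′ cross once, and α′, γ+ as well as β′, γ− are adjacent). We conclude that area(D, W′) = area(D, W) − $\frac{1}{2}$, contradicting the minimality of W.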

Pairs of adjacent arcs
In this section we show that the sequences of adjacent arcs in an extremal quasi wall system on a disk have a periodic structure.
Consider a quasi wall system W on the disk D, made of standard arcs, defining a simple discrete metric disk of radius r centered at O with minimal discrete area as in Section 7. Recall that the upper half plane H = R×[0, ∞) is the universal cover of the cylinder C = S 1 × [0, ∞) = D \ {O}. We identify its boundary ∂H with the real line R. Let W H be the quasi wall system on H formed of all the lifts of the arcs of W.
Since D is a disk of radius r, it follows that every generic point of ∂H is covered by exactly r arcs of W H . To ensure this uniform coverage, each endpoint of an arc must be the startpoint of another arc, and thus each arc of W H belongs to a bi-infinite sequence of consecutive arcs, called a "strand" of W H .
Definition 8.1. A strand of W H is a bi-infinite sequence (α i ) i∈Z of consecutive arcs of W H of the form The points a i where the strand (α i ) i meets the boundary ∂H are called the stops of the strand. The width of an arc α i is the number a i+1 − a i .
Since each strand of arcs covers the generic points of ∂H once, it follows that the quasi wall system W H is composed of exactly r strands.
The following result describes how each strand intersects a pair of adjacent arcs of W H .
Lemma 8.2. Let α 0 = a 0 a 1 and α 1 = a 1 a 2 be two adjacent arcs of W H . Then every strand of W H has exactly one arc with endpoints on the boundary interval I = [a 0 , a 2 ). This arc is covered by α 0 or by α 1 .
Proof. The strand that contains the arcs α_0 and α_1 clearly satisfies the lemma. Thus let (β_i)_{i∈Z} be any other strand of W_H, numbered so that the arc β_0 covers the point a_1. This strand has a stop in I, otherwise β_0 would cover the two adjacent arcs α_0 and α_1, in contradiction with Lemma 7.1. Also, the strand (β_i)_i cannot have stops in both intervals [a_0, a_1) and [a_1, a_2), otherwise the arc β_0 would intersect the two adjacent arcs α_0 and α_1, in contradiction with Lemma 7.2. Thus the strand (β_i)_i has stops in exactly one of the intervals [a_0, a_1) and [a_1, a_2), say, the second one; see Figure 4. Furthermore, it cannot have just one stop in this interval, otherwise the two adjacent arcs β_0, β_1 that share this stop would intersect α_1, in contradiction with Lemma 7.2. Also, it cannot have three stops in the interval, otherwise the adjacent arcs β_1 and β_2 would be covered by α_1, in contradiction with Lemma 7.1. We conclude that the strand (β_i)_i has exactly two stops (and therefore one arc) in the interval [a_0, a_2), and both of these stops are covered by one of the arcs α_0 or α_1. See Figure 4.

Let n be the number of walls of the quasi wall system W on the disk D. From now on, changing the parameterization of the boundary circle S¹ = ∂D, we assume that S¹ is a circle of length n, thus S¹ = R/nZ, and that the endpoints of the walls of W are located at the semi-integer points. (This implies that the distance between two adjacent semi-integer points is equal to 1.) Therefore, on the universal cover of the cylinder C = D \ {O}, which is the upper half plane H, we have ∂W_H = Z + $\frac{1}{2}$ ⊆ R = ∂H. Note that the quasi wall system W_H is periodic of period n (where n is the number of walls of W) in the sense that it is invariant by the horizontal translation of length n. However, the following result implies that W_H is also periodic with period 2r, where r is the number of strands of W_H; see Definition 8.1.
Lemma 8.3. The sum of the widths of two adjacent arcs α 0 , α 1 of W H is equal to 2r.
Proof. Consider two adjacent arcs α 0 = a 0 a 1 and α 1 = a 1 a 2 as in Lemma 8.2. According to that lemma, each of the r strands of W H has exactly two stops in the interval [a 0 , a 2 ). Therefore there are 2r semi-integers in that interval. It follows that a 2 − a 0 = 2r.
Denote by S [t,t+2r) = [t, t + 2r) × [0, +∞) a strip of width 2r of the halfplane H. The following result describes the arcs of the quasi wall system W H that are contained in such a strip.
Proposition 8.4.
(1) Each strip S_[t,t+2r) contains exactly one arc of each strand (and each of these arcs determines its strand completely).
(2) The r arcs contained in a strip S_[t,t+2r) do not intersect each other.
(3) Any pair of strands intersects each other exactly twice in the strip S_[t,t+2r).

Proof.
(1) Consider a strand (α i ) i∈Z , with α i = a i a i+1 . According to Lemma 8.3, we have the equation a i+2 = a i +2r for all i. This implies that the strip S [t,t+2r) contains exactly two stops and thus exactly one arc of the strand (α i ) i . The same equation implies that two consecutive stops determine the strand.
(2) Consider a second strand (β j ) j∈Z , with β j = b j b j+1 . Assuming that two arcs α 0 and β 0 of W H intersect, we want to show that they are not contained in a strip S [t,t+2r) . We may assume without loss of generality that a 0 < b 0 , therefore b 0 ∈ (a 0 , a 1 ). Since the strand (β j ) j has a stop in the interval [a 0 , a 1 ) by Lemma 8.2 it cannot have a stop in [a 1 , a 2 ). It follows that b 1 > a 2 = a 0 + 2r, hence the arcs α 0 = a 0 , a 1 and β 0 = b 0 , b 1 are not contained in a strip of width 2r.
(3) Consider two strands (α_i)_{i∈Z} and (β_j)_{j∈Z} as above. Since a_{i+2} = a_i + 2r as shown in (1), the strand (α_i)_i is invariant by the horizontal translation of displacement 2r. The same holds for (β_j)_j. We want to show that they cross exactly twice in a strip S_[t,t+2r). By invariance under the horizontal translation of length 2r, we may choose t arbitrarily. For instance, we can choose t = a_0. By Lemma 8.2, the strand (β_j)_j has stops in exactly one of the intervals (a_0, a_1) and (a_1, a_2). Thus, it intersects (twice) exactly one of the arcs α_0 = a_0 a_1, α_1 = a_1 a_2.
We also note the following.
Lemma 8.5. In the quasi wall system W H , there is an arc of width 1.
Proof. Let α 0 = a 0 a 1 be an arc that is minimal with respect to covering (i.e., α 0 does not cover any arc of W H ). We want to show that a 1 − a 0 = 1. By Lemma 8.2, each strand other than the one generated by α 0 has two stops in the interval (a 0 , a 2 ), both contained either in (a 0 , a 1 ) or in (a 1 , a 2 ). Thus, if the interval (a 0 , a 1 ) has any stop, it has in fact two stops of a strand, and therefore there is an arc of W H covered by α 0 . However, this possibility is excluded by the minimality of α 0 . Therefore, the interval (a 0 , a 1 ) has no stops and hence its endpoints a 0 and a 1 are consecutive semi-integers.

Proof of the discrete area lower bound
We can now proceed to the proof of the discrete area lower bound for simple discrete metric disks, see Theorem 5.3, making use of the previous notations and constructions. Namely, let us prove the following.
Theorem 9.1. The discrete area of every simple discrete metric disk of radius r is at least $\frac{3}{2} r^2$.

Proof. Let (D, W′) be a simple discrete metric disk of radius r and center O that has minimal area. Recall that the punctured disk D \ {O} is identified with the flat cylinder C = S¹ × [0, +∞). As shown in Section 6, W′ is homotopic in C to a quasi wall system W made of standard arcs, such that (D, W) is also a discrete disk of radius r centered at O and has the same area as (D, W′). Thus we must show that area(D, W) ≥ $\frac{3}{2} r^2$. Also, we may assume that the lift of W to the universal cover H = R × [0, +∞) is a quasi wall system W_H such that ∂W_H = Z + $\frac{1}{2}$ ⊆ R = ∂H as in Section 8. Let t ∈ R be a generic number. By Proposition 8.4, the weighted number of self-intersections of the quasi wall system W_H that lie in the strip S_[t,t+2r) is $2\binom{r}{2} + \frac{1}{2} \cdot 2r = r^2$. The first term counts the crossings between the different strands: each pair of strands crosses twice, and the crossings are located in the interior of the half-plane H. The second term counts, with weight $\frac{1}{2}$, the intersections that lie in the boundary ∂H; these are the intersections between adjacent arcs, which belong to the same strand. Since a fundamental domain of width n is made of $\frac{n}{2r}$ strips of width 2r, the discrete area of the disk (D, W) is area(D, W) = $\frac{n}{2r} \cdot r^2 = \frac{rn}{2}$, where n is the number of walls of W.
To finish we will show that n ≥ 3r. Let (α_i = a_i a_{i+1})_{i∈Z} be a strand of W_H such that a_0 − a_{−1} = 1. Such a strand exists by Lemma 8.5. Moreover, we may assume that a_0 = $\frac{1}{2}$ and a_{−1} = −$\frac{1}{2}$. The interval (a_0, a_1) has width 2r − 1 (by Lemma 8.3) and contains 2r − 2 semi-integers.
Each of these semi-integers is either the startpoint or the endpoint of one of the r − 1 arcs that are covered by α_0; see Proposition 8.4. Let b_0 be the rightmost of the r − 1 startpoints. Note that

b_0 ≥ a_0 + (r − 1). (9.2)

This point b_0 is a stop of a strand (β_j = b_j b_{j+1})_{j∈Z}. The arc β_0 is covered by α_0 and the arc β_1 = b_1 b_2 intersects the arc α_0. The arcs α_0 and β_1 cannot extend over a whole fundamental domain S_[t,t+n) of the universal cover, by the property (1) of Proposition 6.5. Therefore, n > b_2 − a_0. On the other hand, by Lemma 8.3 and the inequality (9.2), we have b_2 − a_0 = (b_2 − b_0) + (b_0 − a_0) ≥ 2r + (r − 1) = 3r − 1. We conclude that n > 3r − 1, or, equivalently, n ≥ 3r, as we had to prove.

Simple discrete metric disks of minimal area
In this section, we analyze the equality case of Theorem 9.1.
Proposition 10.1. For every positive integer r, there is a simple discrete metric disk of radius r and area $\frac{3}{2} r^2$. It is unique up to isotopy of the disk with the center fixed.
Proof. Recall the proof of Theorem 9.1. Let W′ be a simple quasi wall system such that (D, W′) is a simple discrete metric disk of radius r with minimal discrete area. Consider the simple quasi wall system W homotopic to W′ made of standard arcs. To attain the lower bound on area(D, W), and so on area(D, W′), we must have n = 3r, therefore the inequality (9.2) must be an equality. This implies that, for the r − 1 arcs covered by α_0, the r − 1 startpoints must precede the r − 1 endpoints in the interval (a_0, a_1). In consequence, these r − 1 arcs together with the arc α_0 form a chain with respect to the covering relation; see Figure 5. This implies that the r arcs are completely determined, and by Proposition 8.4, so are the quasi wall systems W_H and W, which are made of standard arcs. Thus, the quasi wall system W_H contains all arcs of the form [kr − s, kr + s] with k integer and s ∈ (0, r) semi-integer; see Figure 5. Similarly, the quasi wall system W is obtained from W_H by taking the quotient of H under the horizontal translation of length 3r; see Figure 6. This proves the uniqueness of the simple discrete metric disk of minimal area, but only up to homotopy of the quasi wall system. The uniqueness up to isotopy of the disk follows from the next result.

Proof. We proceed by induction on the number n of walls. The case n = 1 is trivial. In general, we argue as follows.
Let γ be a wall of W that covers no other wall of W; see Definition 6.3. The curve γ divides the disk D into two topological closed disks A and B which intersect along γ, with O ∈ A. The part of W that lies in B consists of k ≥ 0 arcs going from γ to ∂B \ γ. These arcs are pairwise disjoint, otherwise they would form a triangle in B ⊆ D \ {O}. The part of W that lies in A, excluding γ, is a quasi wall system on A with n − 1 walls.
Let $\gamma'$ be the wall of $W'$ homotopic to $\gamma$ in $D \setminus \{O\}$. We apply to $W'$ a first isotopy of $D \setminus \{O\}$ to ensure that $\gamma' = \gamma$. The wall $\gamma'$ does not cover any other wall $\beta'$ of $W'$, otherwise the wall $\beta$ of $W$ homotopic to $\beta'$ would cross $\gamma$ twice. As in $W$, the part of $W'$ lying in $B$ consists of $k$ pairwise disjoint arcs going from $\gamma$ to $\partial B \setminus \gamma$. Thus, by applying a second isotopy, we may ensure that $W' \cap B = W \cap B$. Finally, we get $(W' \setminus \gamma') \cap A = (W \setminus \gamma) \cap A$ by applying an isotopy of the disk $A$ fixing $O$, whose existence is guaranteed by the inductive hypothesis.

Figure 6. An area-minimizing simple discrete disk $(D, W)$ of radius $r = 5$, where the topological disk $D$ is a hexagon and the quasi wall system $W$ consists of straight lines.
Remark 10.3. The isotopy between $W$ and $W'$ can also be derived from [GS97], where it is proved that two wall systems on a closed surface which are homotopic to each other and are both in minimally crossing position (i.e., they attain the minimum number of self-intersections possible in their homotopy class) can be obtained one from the other by isotopies and triangle flip moves (called "type III moves" in [GS97]). Strictly speaking, we first need to adapt this result to quasi wall systems on surfaces with boundary.
Since $W$ and $W'$ do not form any triangle in $D \setminus \{O\}$, we conclude that they are isotopic in $D$.

Construction of almost minimizing Finsler disks
In this section, we construct a Finsler disk of radius $r$ with minimizing interior geodesics whose area is arbitrarily close to the lower bound $\frac{6}{\pi} r^2$ given by Theorem 1.2.
Let us first go over Busemann's construction of projective metrics in relation with Hilbert's fourth problem. We refer to [Bus76], [Pog79], [Ale78], [Sza86], [Pap14] and references therein for an account on the subject.
The space $\Gamma$ of oriented lines in $\mathbb{R}^2$ can be identified with $S^1 \times \mathbb{R}$. Under this identification, an oriented line $\gamma$ is represented by a pair $(e^{i\theta}, p)$, where $e^{i\theta}$ is the direction of the oriented line $\gamma$ and $p = \langle \overrightarrow{OH} \times e^{i\theta}, \vec{e}_z \rangle$ is the signed distance from the origin $O$ to $\gamma$. Here, $H$ is a point of $\gamma$, the vector $\vec{e}_z$ is the third vector in the canonical basis of $\mathbb{R}^3$ (thus a unit vector orthogonal to $\mathbb{R}^2$), and "$\times$" is the vector product in $\mathbb{R}^3$.
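The identification of oriented lines with pairs $(e^{i\theta}, p)$ can be illustrated by a minimal sketch. The function name `line_coordinates` is hypothetical, and the only formula used is the $z$-component of the vector product $\overrightarrow{OH} \times e^{i\theta}$.

```python
import math

def line_coordinates(point, theta):
    """Return (theta, p) for the oriented line through `point` with direction e^{i theta}.

    p is the z-component of OH x e^{i theta}, i.e. the signed distance from the origin.
    """
    hx, hy = point
    p = hx * math.sin(theta) - hy * math.cos(theta)
    return theta % (2 * math.pi), p

# The value of p does not depend on the chosen point H of the line,
# and reversing the orientation (theta -> theta + pi) changes p into -p.
_, p_a = line_coordinates((0.0, 1.0), 0.0)
_, p_b = line_coordinates((3.0, 1.0), 0.0)       # another point of the same line
_, p_rev = line_coordinates((0.0, 1.0), math.pi)  # same line, reversed orientation
```

In this sketch the horizontal line through $(0, 1)$ oriented to the right has $p = -1$, and the same line with the opposite orientation has $p = 1$, so the orientation-reversing involution of $\Gamma$ reads $(\theta, p) \mapsto (\theta + \pi, -p)$.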
Definition 11.1. Let $\mu$ be a (nonnegative) Borel measure on $\Gamma$. Consider the following conditions:
(1) the measure is invariant under the involution of $\Gamma$ reversing the orientation of lines;
(2) the measure of every compact subset of $\Gamma$ is finite;
(3) the set of all oriented lines passing through any given point of $\mathbb{R}^2$ has measure zero;
(4) the set of all oriented lines passing through any given line segment in $\mathbb{R}^2$ has positive measure.
A Borel measure $\mu$ satisfying (1)-(3) induces a length function
\[ \mathrm{length}_\mu(\alpha) = \frac{1}{4} \int_{\gamma \in \Gamma} \#(\gamma \cap \alpha) \, d\mu(\gamma) \]
defined for any curve $\alpha$ in the plane $\mathbb{R}^2$. For this kind of length function, straight segments are shortest paths, therefore the pseudo-distance associated to this length function is
\[ d_\mu(x, y) = \frac{1}{4} \, \mu(\Gamma_{[x,y]}) \]
where $\Gamma_A$ denotes the set of lines $\gamma \in \Gamma$ that intersect a subset or point $A$ contained in the plane $\mathbb{R}^2$. The pseudo-distance $d_\mu$ is projective, which means that $d_\mu(x, z) = d_\mu(x, y) + d_\mu(y, z)$ for every $x, y, z \in \mathbb{R}^2$ with $y \in [x, z]$, and in fact every continuous projective distance is obtained from a unique measure $\mu$; see [Ale78]. If $\mu$ also satisfies (4) then $d_\mu$ is a projective distance (and vice versa).
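As a sanity check of the normalization, the following sketch approximates $\frac{1}{4}\mu(\Gamma_{[x,y]})$ by a midpoint rule for a measure with density in the coordinates $(\theta, p)$. The factor one quarter is the normalization used for $d_{\nu_k}$ later in this section; the function name `crofton_distance`, the grid sizes and the sample density are assumptions made for illustration only.

```python
import math

def crofton_distance(x, y, density=lambda theta, p: 1.0, n_theta=2000, n_p=50):
    """Approximate (1/4) * mu(Gamma_[x,y]) for d mu = density(theta, p) d theta dp."""
    total = 0.0
    d_theta = 2 * math.pi / n_theta
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        s, c = math.sin(theta), math.cos(theta)
        # A line of direction theta meets [x, y] exactly when its coordinate p
        # lies between the p-coordinates of the lines through x and through y.
        px = x[0] * s - x[1] * c
        py = y[0] * s - y[1] * c
        lo, hi = min(px, py), max(px, py)
        dp = (hi - lo) / n_p
        for j in range(n_p):
            total += density(theta, lo + (j + 0.5) * dp) * dp * d_theta
    return total / 4

# Uniform measure: the pseudo-distance should be close to the Euclidean distance.
euclid = crofton_distance((0.0, 0.0), (3.0, 4.0))

# A non-uniform density invariant under (theta, p) -> (theta + pi, -p):
dens = lambda theta, p: 2.0 + math.cos(2.0 * theta)
a, b, c = (0.0, 0.0), (1.0, 0.5), (2.0, 1.0)   # b lies on the segment [a, c]
defect = crofton_distance(a, c, dens) - crofton_distance(a, b, dens) - crofton_distance(b, c, dens)
```

With the uniform density the value `euclid` is close to 5, the Euclidean length of the segment, and `defect` vanishes up to rounding, which reflects the projective property $d_\mu(x, z) = d_\mu(x, y) + d_\mu(y, z)$ for $y \in [x, z]$.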
The projective distance induced by a Borel measure satisfying the conditions (1)-(4) is not Finsler in general. Borel measures inducing a Finsler metric can be characterized as follows; see [Álv05] for a presentation of this result due to Pogorelov [Pog79] and [ÁB10] for a generalization.
Theorem 11.2. Let µ be a Borel measure on Γ satisfying (1)-(4). The distance d µ is Finsler if and only if µ is a positive smooth measure. In this case, the smooth measure on Γ induced by the symplectic form associated to the Finsler metric, see (3.8), coincides with µ.
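Here, a measure $\mu$ on $\Gamma$ is (positive) smooth if it admits a (positive) smooth function $h$ as density with respect to the standard product measure $\lambda$ on $\Gamma$ given by $d\lambda = d\theta \, dp$, that is, $d\mu = h \, d\lambda$.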
Remark 11.3. The geodesics of a plane with a projective Finsler metric d µ are the straight lines parametrized by µ-length. Therefore, a plane with a projective Finsler metric has minimizing geodesics.
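We may define the area of a Borel set $D$ in the plane with a measure $\mu$ on $\Gamma$ satisfying (1)-(3) by the Santaló-Blaschke formula (3.6)
\[ \mathrm{area}_\mu(D) = \frac{1}{8\pi} \int_\Gamma \int_\Gamma \#(\gamma \cap \gamma' \cap D) \, d\mu(\gamma) \, d\mu(\gamma'). \tag{11.1} \]
In other terms, the area measure is the normalized pushforward measure
\[ \mathrm{area}_\mu = \frac{1}{8\pi} \, i_*(\mu \times \mu) \tag{11.2} \]
where $i : \Gamma \times \Gamma \setminus \Delta_\Gamma \to \mathbb{RP}^2$ maps each ordered pair of different lines to its intersection point in the projective plane $\mathbb{RP}^2 \supseteq \mathbb{R}^2$. (Note that the diagonal $\Delta_\Gamma$ has measure zero because $\mu$ has no atoms.) This area function coincides with the Holmes-Thompson area if the metric is Finsler; see (3.6).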
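The normalization constant $\frac{1}{8\pi}$ can be checked on the Euclidean case $\mu = \lambda$, where the area of a round disk should be recovered. The sketch below assumes the Crofton identity $\lambda(\Gamma_{[x,y]}) = 4|x - y|$ for the uniform measure and integrates it over all chords of the disk; the function name and the grid size are illustrative assumptions.

```python
import math

def euclidean_disk_area_via_lines(radius=1.0, n_p=200000):
    """(1/(8 pi)) * (lambda x lambda){ordered pairs of oriented lines meeting in the disk}."""
    dp = 2 * radius / n_p
    chord_integral = 0.0
    for j in range(n_p):
        p = -radius + (j + 0.5) * dp
        # A line at signed distance p cuts the disk along a chord of this length.
        chord_integral += 2 * math.sqrt(radius * radius - p * p) * dp
    # For each fixed oriented line, the oriented lines crossing its chord have
    # lambda-measure 4 * (chord length); then integrate over theta in [0, 2 pi).
    pair_measure = 2 * math.pi * 4 * chord_integral
    return pair_measure / (8 * math.pi)

unit_disk_area = euclidean_disk_area_via_lines(1.0)
```

The returned value is close to $\pi$ for the unit disk, in agreement with the fact that the uniform measure $\lambda$ induces the Euclidean metric.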
11.1. Construction of a non-Finsler extremal disk.
Let us construct a non-Finsler projective pseudo-metric disk satisfying the equality case in Theorem 1.2. Consider the three pairs of one-parameter families $L_k^{\pm}$ of oriented lines in $\mathbb{R}^2$, parametrized by $t \in \mathbb{R}_+$, where $k \in \{0, 1, 2\}$; see Figure 7. Note that the lines $L_k^+(t)$ and $L_k^-(t)$ only differ by their orientation. We will sometimes denote these families of lines by $L_k$ when the orientation does not matter. Consider the (nonsmooth) Borel measure on $\Gamma$
\[ \mu_{\mathrm{ext}} = \nu_0 + \nu_1 + \nu_2, \quad \text{where} \quad \nu_k = \tfrac{1}{2} \left[ (L_k^+)_* \mathcal{L} + (L_k^-)_* \mathcal{L} \right] \]
is the average of the push-forwards to $\Gamma$ of the Lebesgue measure $\mathcal{L}$ on $\mathbb{R}_+$. Let $D_k$ be the line passing through $O$ orthogonal to $L_k$. Let $D_k^+ \subseteq D_k$ be the ray from $O$ that intersects orthogonally every line $L_k(t)$. Denote by $\pi_k$ the orthogonal projection of $\mathbb{R}^2$ to $D_k$. By construction, the $d_{\nu_k}$-pseudo-distance between two points $x, y \in \mathbb{R}^2$ is equal to one quarter times the Euclidean length of the part of the projection of $[x, y]$ to $D_k$ lying in $D_k^+$. That is,
\[ d_{\nu_k}(x, y) = \frac{1}{4} \, \mathrm{length}\left( \pi_k([x, y]) \cap D_k^+ \right) \]
for every $x, y \in \mathbb{R}^2$. Furthermore,
\[ d_{\mu_{\mathrm{ext}}} = d_{\nu_0} + d_{\nu_1} + d_{\nu_2}. \]
Observe also that the measure $\mu_{\mathrm{ext}}$ satisfies (1)-(3), but not (4). Thus, $d_{\mu_{\mathrm{ext}}}$ is a projective pseudo-distance on $\mathbb{R}^2$. The disk $D_{\mu_{\mathrm{ext}}}(r)$ of radius $r$ for the pseudo-distance $d_{\mu_{\mathrm{ext}}}$ with center the origin $O$ of $\mathbb{R}^2$ is the minimal regular hexagon containing the Euclidean disk of radius $4r$, whose vertices are $\frac{2\sqrt{3}}{3} \, 4r \, e^{i k \frac{\pi}{3}}$ for $k \in \{0, \dots, 5\}$; see Figure 7. A direct computation using (11.1) shows that its area is $\frac{6}{\pi} r^2$. Thus, the disk $D_{\mu_{\mathrm{ext}}}(r)$ is a non-Finsler projective pseudo-metric disk satisfying the equality case in Theorem 1.2. One can think of it as an extremal (degenerate) metric for the problem considered. Observe also that $D_{\mu_{\mathrm{ext}}}(r)$ is not rotationally symmetric.
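The claims of this paragraph can be checked numerically under explicit assumptions. In the sketch below, the rays orthogonal to the three families are assumed to point in the directions $u_k = e^{i(\pi/2 + 2k\pi/3)}$, and the line $L_k(t)$ is assumed to be $\{x : \langle x, u_k \rangle = t\}$ for $t \geq 0$; these choices are not taken from the text, but they are compatible with a hexagon whose vertices lie at the angles $k\pi/3$. The names `U`, `d_ext` and `area_ext` are hypothetical.

```python
import math

# ASSUMPTION: unit normals u_k at angles pi/2 + 2*k*pi/3, with L_k(t) = {x : <x, u_k> = t}, t >= 0.
U = [(math.cos(math.pi / 2 + 2 * k * math.pi / 3),
      math.sin(math.pi / 2 + 2 * k * math.pi / 3)) for k in range(3)]

def d_ext(x, y):
    """Sum over k of one quarter of the length of the projection of [x, y] lying in the ray of u_k."""
    total = 0.0
    for ux, uy in U:
        a, b = x[0] * ux + x[1] * uy, y[0] * ux + y[1] * uy
        lo, hi = min(a, b), max(a, b)
        total += max(0.0, hi - max(lo, 0.0)) / 4.0
    return total

def area_ext(r, n=200):
    """(1/(8 pi)) * (mu x mu){pairs of lines meeting inside the disk}, by a midpoint rule."""
    origin, h, pair_measure = (0.0, 0.0), 4.0 * r / n, 0.0
    for k in range(3):
        for l in range(3):
            if k == l:
                continue  # two lines of the same family are parallel
            (ax, ay), (bx, by) = U[k], U[l]
            det = ax * by - ay * bx
            # Lines with parameter larger than 4r do not meet the disk of radius r.
            for i in range(n):
                t = (i + 0.5) * h
                for j in range(n):
                    s = (j + 0.5) * h
                    # Intersection point of L_k(t) and L_l(s).
                    x = ((t * by - s * ay) / det, (s * ax - t * bx) / det)
                    if d_ext(origin, x) <= r:
                        pair_measure += h * h
    return pair_measure / (8.0 * math.pi)

r = 1.0
vertex = (2 * math.sqrt(3) / 3 * 4 * r, 0.0)   # a vertex of the hexagon
apothem_point = (0.0, 4 * r)                   # midpoint of a side of the hexagon
```

Under these assumptions, both `vertex` and `apothem_point` lie at pseudo-distance $r$ from the origin although their Euclidean norms differ, which illustrates the lack of rotational symmetry, and `area_ext(r)` approximates $\frac{6}{\pi} r^2$ up to the discretization error of the grid. The segment from $(-0.1, 1)$ to $(0.1, 1)$ has zero pseudo-length, which shows concretely why condition (4) fails.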
Remark 11.4. By identifying all pairs of points at zero pseudo-distance, the pseudo-metric disk $D_{\mu_{\mathrm{ext}}}(r)$ identifies with the closed ball $D(r)$ of radius $r$ centered at the tip of a cone composed of three copies of a quadrant of the $\ell^1$-plane glued together. It follows from a direct computation that the Holmes-Thompson area of the disk $D(r)$ is equal to $\frac{6}{\pi} r^2$. Defined in this way, the metric on $D(r)$ is non-Finsler (e.g., it has a singularity at the origin and the tangent norms are neither smooth nor strongly convex) but can still be thought of as an extremal (degenerate) metric. Note that the (pseudo)-metrics on $D_{\mu_{\mathrm{ext}}}(r)$ and $D(r)$ can be viewed as continuous versions of the extremal simple discrete disk; see Section 10.

11.2. Construction of a Finsler nearly extremal disk.
In the rest of this section, we explain how to modify the pseudo-metric $d_{\mu_{\mathrm{ext}}}$ so as to obtain a projective Finsler disk of radius $r$ whose area is arbitrarily close to $\frac{6}{\pi} r^2$. First, the projective pseudo-metric $d_{\mu_{\mathrm{ext}}}$ can be approximated by a projective metric by simply adding to $\mu_{\mathrm{ext}}$ a multiple $\varepsilon\lambda$ of the uniform measure $\lambda$ (given by $d\lambda = d\theta \, dp$) so that condition (4) is also satisfied; this changes $d_{\mu_{\mathrm{ext}}}$ by adding $\varepsilon$ times the Euclidean distance. This projective metric is not Finsler, but in turn it can be approximated by a Finsler metric; see [Pog79]. More generally, every projective distance $d_\mu$, where $\mu$ is a Borel measure satisfying (1)-(4), can be approximated by a projective Finsler distance on every compact set of $\mathbb{R}^2$. Thus, by Theorem 11.2, there exists a sequence $\mu_n$ of positive smooth measures on $\Gamma$ such that the corresponding sequence of Finsler distances $d_{\mu_n}$ uniformly converges to $d_{\mu_{\mathrm{ext}}}$ on every compact set of $\mathbb{R}^2$. This approximation result is obtained by a convolution argument on the distance function $d_\mu$. Although it is possible that the measures $\mu_n$ weakly converge to $\mu_{\mathrm{ext}}$, this issue is not addressed in [Pog79]. This leads us to consider a slightly different approach.
Instead of regularizing the distance function, we smooth out the measure µ ext and show that the corresponding projective Finsler distance converges to d µext . This alternative approach to the regularization of a projective distance provides a weak convergence of measure by construction, which allows us to estimate areas as well as distances.
We proceed as follows. First, we truncate the measure µ ext by setting a bound for the absolute value of the p coordinate of the lines γ ∈ Γ. In this way, we obtain a probability measure µ 0 on Γ, without changing the corresponding distance function in a neighborhood of the origin. Similarly, we truncate the uniform measure λ to a probability measure λ 0 . This enables us to use standard theorems on weak convergence of probability measures.
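Let us now describe the convolution process. For $\varepsilon > 0$, let $h_\varepsilon$ be a smooth nonnegative function on $\Gamma = \mathbb{R}/2\pi\mathbb{Z} \times \mathbb{R}$, with support in $(-\varepsilon, \varepsilon) \times (-\varepsilon, \varepsilon)$, such that $\int_\Gamma h_\varepsilon(\theta, p) \, d\theta \, dp = 1$. For each $\varepsilon > 0$, consider the smooth (nonnegative) measure $\mu_\varepsilon$ on $\Gamma$ with density $h_\varepsilon * \mu_0$, that is,
\[ d\mu_\varepsilon = (h_\varepsilon * \mu_0) \, d\lambda \]
where $h_\varepsilon * \mu_0$ is the smooth function on $\Gamma$ defined by the convolution
\[ (h_\varepsilon * \mu_0)(\theta, p) = \int_\Gamma h_\varepsilon(\theta - \theta', p - p') \, d\mu_0(\theta', p') \]
and $\lambda$ is the standard product measure on $\Gamma = \mathbb{R}/2\pi\mathbb{Z} \times \mathbb{R}$, given by $d\lambda = d\theta \, dp$. By [Bog18, §1.4.3], the smooth measure $\mu_\varepsilon$ weakly converges to $\mu_0$ as $\varepsilon$ goes to zero. Define also the measure $\mu_\varepsilon^+$, which also converges to $\mu_0$ as $\varepsilon \to 0$. By Theorem 11.2, the distance $d_{\mu_\varepsilon^+}$ induced by $\mu_\varepsilon^+$ is a projective Finsler distance on a neighborhood of the origin in $\mathbb{R}^2$.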
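The weak convergence of a mollified measure can be illustrated on a deliberately simplified one-dimensional model, which is not the measure of the construction: here $\mu_0$ is assumed to be a finite sum of weighted Dirac masses on the real line, and the bump function is the standard $\exp(-1/(1 - u^2))$ rescaled to $(-\varepsilon, \varepsilon)$. The names `bump` and `mollified_integral` are hypothetical.

```python
import math

def bump(u):
    """Smooth nonnegative function supported in (-1, 1), not yet normalized."""
    return math.exp(-1.0 / (1.0 - u * u)) if abs(u) < 1 else 0.0

def mollified_integral(f, atoms, eps, n=4000):
    """Integrate f against the measure with density h_eps * mu_0 (1D sketch).

    mu_0 is the sum of weight * Dirac(point) over `atoms`, and
    h_eps(p) = bump(p / eps) / (eps * mass) has total integral one.
    """
    du = 2.0 / n
    nodes = [-1.0 + (i + 0.5) * du for i in range(n)]
    mass = sum(bump(u) for u in nodes) * du          # normalizing constant of the bump
    total = 0.0
    for point, weight in atoms:
        # Change of variables p = point + eps * u in the integral of f(p) h_eps(p - point).
        total += weight * sum(f(point + eps * u) * bump(u) for u in nodes) * du / mass
    return total

atoms = [(0.0, 0.5), (1.0, 0.5)]
exact = 0.5 * (math.cos(0.0) + math.cos(1.0))        # integral of cos against mu_0
err_small = abs(mollified_integral(math.cos, atoms, 0.01) - exact)
err_large = abs(mollified_integral(math.cos, atoms, 0.5) - exact)
```

In this toy model the integral of a bounded continuous test function against the smoothed measure approaches its integral against $\mu_0$ as $\varepsilon$ decreases, which is the defining property of weak convergence.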
To approximate distances and areas, we have the following tools.
Lemma 11.5. Let µ and µ n be probability measures on Γ satisfying the conditions (1)-(3) of Definition 11.1. If µ n weakly converges to µ, then the distance d µn converges uniformly to d µ on every compact subset of R 2 .
Proof. Note first that the distance between two points $x, y \in \mathbb{R}^2$ is
\[ d_\mu(x, y) = \frac{1}{4} \, \mu(\Gamma_{[x,y]}) \]
where $\Gamma_{[x,y]}$ denotes the set of lines that intersect the segment $[x, y]$. Thus, for a specific pair of points $x, y$, the weak convergence $\mu_n \to \mu$ implies that $d_{\mu_n}(x, y) \to d_\mu(x, y)$ by the portmanteau theorem [Bil99, Theorem 2.1], since $\Gamma_{[x,y]}$ is a continuity set for $\mu$. That is, its boundary $\partial \Gamma_{[x,y]} = \Gamma_x \cup \Gamma_y$ (where $\Gamma_z$ is the set of lines that contain a point $z$) has measure $\mu(\partial \Gamma_{[x,y]}) = 0$ since $\mu(\Gamma_z) = 0$ for each point $z \in \mathbb{R}^2$ by condition (3) on $\mu$.
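To show that this convergence holds uniformly for $x, y$ in any given compact set $K \subseteq \mathbb{R}^2$, let $\mathcal{A}$ be the family of sets $\Gamma_{[x,y]}$ for $x, y \in K$. According to Theorems 2 and 3 from [BT67], to show uniform convergence $\mu_n(A) \to \mu(A)$ for all sets $A \in \mathcal{A}$, it is sufficient to show that $\mu(B_\delta(\partial A)) \to 0$ uniformly as $\delta \to 0$, where $B_\delta(S)$ denotes the $\delta$-neighborhood of a set $S \subseteq \Gamma$ (say, with respect to the supremum distance in terms of the coordinates $\theta, p$). Now,
\[ B_\delta(\partial \Gamma_{[x,y]}) = B_\delta(\Gamma_x) \cup B_\delta(\Gamma_y), \]
therefore it suffices to show that $\mu(B_\delta(\Gamma_x)) \to 0$ uniformly for all $x \in K$ as $\delta \to 0$. Suppose that this is not the case. Then there are sequences $\delta_m \to 0$ and $x_m \to x \in K$ such that $\mu(B_{\delta_m}(\Gamma_{x_m}))$ does not tend to zero. However, we also have $B_{\delta_m}(\Gamma_{x_m}) \subseteq B_\delta(\Gamma_x)$ for every fixed $\delta > 0$ and every $m$ large enough, while $\mu(B_\delta(\Gamma_x)) \to \mu(\Gamma_x) = 0$ as $\delta \to 0$ because $\Gamma_x$ is closed and $\mu$ is finite. This contradiction finishes the proof.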
Proof. We will use some properties of weak convergence of probability measures; see [Bil99]. Since $\mu_n$ weakly converges to $\mu$, it follows from [Bil99, Example 3.2] that the product measure $\mu_n \times \mu_n$ converges weakly to $\mu \times \mu$ on $\Gamma \times \Gamma$. Restricting to the set $\Gamma \times \Gamma \setminus \Delta_\Gamma$, the measures $\mu \times \mu$ and $\mu_n \times \mu_n$ are still probability measures since the diagonal $\Delta_\Gamma \subseteq \Gamma \times \Gamma$ has zero measure because $\mu$ and $\mu_n$ have no atoms. Moreover, since the diagonal $\Delta_\Gamma$ is a closed set, the product measure $\mu_n \times \mu_n$ weakly converges to $\mu \times \mu$ on $\Gamma \times \Gamma \setminus \Delta_\Gamma$ by condition (iv) of the portmanteau theorem [Bil99, Theorem 2.1]. Furthermore, since the function $i : \Gamma \times \Gamma \setminus \Delta_\Gamma \to \mathbb{RP}^2$ is continuous, the pushforward measure $i_*(\mu_n \times \mu_n)$ weakly converges to $i_*(\mu \times \mu)$ by the definition of weak convergence; see [Bil99, p. 14]. Therefore, the area measure $\mathrm{area}_{\mu_n} = \frac{1}{8\pi} i_*(\mu_n \times \mu_n)$ weakly converges to $\mathrm{area}_\mu$, with both area measures considered as probability measures on the projective plane $\mathbb{RP}^2$; see (11.2). Finally, to show that $\mathrm{area}_{\mu_n}(K) \to \mathrm{area}_\mu(K)$, we must check, according to part (v) of the portmanteau theorem [Bil99, Theorem 2.1], that $K$ is a continuity set of $\mathrm{area}_\mu$, which by definition means that $\mathrm{area}_\mu(\partial_{\mathbb{RP}^2} K) = 0$. This follows from the facts that $K$ is compact and $\mathrm{area}_\mu(\partial_{\mathbb{R}^2} K) = 0$.
Consider the disk D µ + ε (r) centered at O of radius r for the distance d µ + ε . The numbers r > 0 and ε > 0 are small enough so that the truncations of µ ext and λ have no effect on the disk D µ + ε (r). The number r is fixed while ε goes to 0.
Proposition 11.7. The disk $D_{\mu_\varepsilon^+}(r)$ is a projective Finsler disk with minimizing interior geodesics, whose area converges to $\frac{6}{\pi} r^2$ as $\varepsilon$ goes to zero. Therefore, the area lower bound in Theorem 1.2 is sharp.
Proof. The fact that d µ + ε is a projective Finsler metric follows from Theorem 11.2 and the fact that its geodesics are minimizing was stated in Remark 11.3.

Appendix: Differentiability of distance-realizing paths on Finsler surfaces with boundary
Consider a smooth manifold $M$ with smooth boundary endowed with a Finsler metric $F$. Recall that a distance-realizing curve is a curve $\alpha : I \to M$ defined on an interval $I \subseteq \mathbb{R}$ such that
\[ d_F(\alpha(t), \alpha(t')) = t' - t \quad \text{for every } t \leq t' \text{ in } I. \]
If the manifold has an empty boundary (or, more generally, a convex boundary), then its distance-realizing curves satisfy a differential equation, and it is therefore clear that they are smooth. However, if the boundary is not convex, then the distance-realizing curves are not $C^2$ in general, and they are not even determined by their initial velocity vector. This happens, for instance, on the Euclidean plane minus an open disk.
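The example of the Euclidean plane minus an open disk can be made explicit. The sketch below uses the classical description of the shortest path between two opposite points $(-a, 0)$ and $(a, 0)$, with $a > 1$, avoiding the open unit disk: a tangent segment, a boundary arc, and a second tangent segment. The function name is hypothetical.

```python
import math

def shortest_path_around_unit_disk(a):
    """Shortest path from (-a, 0) to (a, 0), a > 1, in the plane minus the open unit disk.

    Returns the first tangency point (upper path) and the total length.
    """
    phi = math.acos(1.0 / a)                  # angle at O between (-a, 0) and the tangency point
    tangency = (-math.cos(phi), math.sin(phi))
    segment = math.sqrt(a * a - 1.0)          # length of each tangent segment
    arc = math.pi - 2.0 * phi                 # length of the arc followed on the unit circle
    return tangency, 2.0 * segment + arc

T, length = shortest_path_around_unit_disk(2.0)
A = (-2.0, 0.0)
# The segment from A to T is orthogonal to the radius OT, so the path is tangent
# to the circle at T and its velocity is continuous there.
tangency_defect = (T[0] - A[0]) * T[0] + (T[1] - A[1]) * T[1]
```

For $a = 2$ the length is $2\sqrt{3} + \frac{\pi}{3}$. The velocity is continuous at the tangency point, but the curvature jumps from $0$ on the segment to $1$ on the arc, so the path is $C^1$ and not $C^2$. Moreover, a curve running along the boundary circle may leave it tangentially at any time, which is why such curves are not determined by their initial velocity.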
In the case of Riemannian manifolds with boundary, it was claimed in [Wol79] and [AA81] that distance-realizing curves are $C^1$. This result can also be recovered from [LY06] by gluing together two copies of a Riemannian manifold $M$ along their boundaries. The Riemannian metric obtained on the resulting double manifold $N$ is $\alpha$-Hölder continuous for any $\alpha \in (0, 1]$; see [LY06, Example 3.3]. By [LY06], the geodesics on $N$ are $C^1$ (and even $C^{1, \frac{\alpha}{2-\alpha}}$), from which we can deduce that the distance-realizing curves on $M$ are also $C^1$. This argument does not hold for Finsler metrics. Indeed, the double of a Finsler metric is not even a continuous Finsler metric in general.
Here, by adapting the argument of [AA81], we prove that the same result holds for Finsler surfaces.
Theorem 12.1. On a Finsler surface $M$ with boundary, every distance-realizing curve $\alpha : I \to M$ is $C^1$. Furthermore, the velocity vectors $\alpha'(t)$ have unit norm.
Let us introduce some technical definitions. We assume without loss of generality that the surface M is the closed upper half of R 2 .
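Definition 12.2. Let $\alpha : I \to M$ be a continuous curve, where $I \subseteq \mathbb{R}$ is an interval. Fix $t_0 \in I$ and denote $x_0 = \alpha(t_0)$. An arrival velocity of $\alpha$ at $t_0$ is a vector $v \in T_{x_0} M$ that is an accumulation point of the set of vectors
\[ V^- = \left\{ \frac{\alpha(t_0) - \alpha(t)}{t_0 - t} \;\middle|\; t \in I, \ t < t_0 \right\} \]
as $t$ goes to $t_0$. Similarly, a departure velocity of $\alpha$ at $t_0$ is a vector $v \in T_{x_0} M$ that is an accumulation point of the set of vectors
\[ V^+ = \left\{ \frac{\alpha(t) - \alpha(t_0)}{t - t_0} \;\middle|\; t \in I, \ t > t_0 \right\} \]
as $t$ goes to $t_0$. Note that if $\alpha$ is differentiable on the left (resp. right) at $t_0$, then $\alpha$ has exactly one arrival (resp. departure) velocity at $t_0$.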
We begin by proving a weak differentiability result.
Lemma 12.3. Let $(M, F)$ be a Finsler manifold with boundary and let $\alpha : I \to M$ be a distance-realizing curve. Fix $t_0 \in I$ and denote $x_0 = \alpha(t_0)$. Then
(1) The curve $\alpha$ has at least one arrival velocity and one departure velocity at $t_0$ (unless $t_0 = \min I$ or $t_0 = \max I$, respectively).
(2) Every arrival or departure velocity of $\alpha$ at $t_0$ has unit norm with respect to $F_{x_0}$.
(3) If the curve $\alpha$ is differentiable on one side at an interior point $t_0$ of $I$, then $\alpha$ is differentiable at $t_0$.
Proof. By continuity of the Finsler metric at $x_0$, we can bound $F_x$ below and above by two multiples of the norm $F_{x_0} = |\cdot|$ for every $x$ close enough to $x_0$. That is, $\lambda^- |v| \leq F_x(v) \leq \lambda^+ |v|$ for every $v \in \mathbb{R}^n$, which in turn implies that
\[ \lambda^- |y - x| \leq d_F(x, y) \leq \lambda^+ |y - x| \]
for all $x, y$ close enough to $x_0$. This implies that the sets of vectors $V^{\pm}$ are bounded when $t$ goes to $t_0$, which implies the first claim. In fact, as $x$ goes to $x_0$, the optimal coefficients $\lambda^{\pm}$ converge to $1$, which implies the second claim.
To prove the last claim, we assume that the curve $\alpha$ is differentiable on the left at an interior point $t_0$ of $I$. (The argument is similar if $\alpha$ is differentiable on the right at $t_0$.) Let $v^-$ be the arrival tangent vector. Let us prove that $\alpha$ is differentiable on the right at $t_0$ and has departure tangent vector $v^+ = v^-$. By contradiction, assume that the set of vectors $V^+$ has an accumulation point $v^+ \neq v^-$ as $t$ goes to $t_0$. As already noticed in the second claim, we have $|v^-| = |v^+| = 1$. Since the norm $F_{x_0} = |\cdot|$ is strictly convex, we also have $|v^- + v^+| < 2$. Let $\tau_m \to 0$ be a decreasing sequence of positive numbers such that $y_m^+ = \alpha(t_0 + \tau_m) = \alpha(t_0) + \tau_m v^+ + o(\tau_m)$. Since $\alpha$ is differentiable on the left at $t_0$, we also have
\[ y_m^- = \alpha(t_0 - \tau_m) = \alpha(t_0) - \tau_m v^- + o(\tau_m). \]
For $m$ large enough, we can take $\lambda^+$ arbitrarily close to $1$. It follows from the inequality $|v^+ + v^-| < 2$ that $d_F(y_m^-, y_m^+) < 2\tau_m$, contradicting that $\alpha$ is a distance-realizing curve.
Before proceeding to the proof of Theorem 12.1, we extend the Finsler metric $F$ to a surface $M^+ \supseteq M$ with empty boundary; see Remark 2.2. As for any Finsler surface with empty boundary, every point of $M^+$ has a normal neighborhood, that is, an open neighborhood $U$ such that for any two points $x, y \in U$, there is a unique geodesic from $x$ to $y$ contained in $U$ and this geodesic is the unique distance-realizing arc from $x$ to $y$ in $M^+$; see [BCS00, p. 160]. Note that if this geodesic is contained in $M$, then it is also the unique distance-realizing arc from $x$ to $y$ in $M$.
Proof of Theorem 12.1. We assume first that the metric is self-reverse.
Let $\alpha : I \to M$ be a distance-realizing curve. Let $t_0 \in I$ and let $x_0 = \alpha(t_0)$. If $x_0$ lies in the interior of $M$, then the arc $\alpha$ coincides with a geodesic in a neighborhood of $t_0$, where it is $C^1$ (and we are done). Thus, we can assume that $x_0$ lies in $\partial M$.
Again, we assume without loss of generality by working in a small enough neighborhood of x 0 that M is a closed half-space of M + = R 2 and that every geodesic arc is a unique distance-realizing arc.
Suppose that the arc $\alpha$ is not differentiable on the right at some $t_0 \in I$. (The argument is similar if $\alpha$ is not differentiable on the left at $t_0$.) Then the arc $\alpha$ has two distinct departure velocities $v$ and $w$. Let $K_v$ and $K_w$ be two convex cones based at $x_0$ that contain the points $x_0 + v$ and $x_0 + w$ in their interior and only meet at $x_0$. Take a unit vector $u \in T_{x_0} M$ not tangent to $\partial M$ that points in the interior of $M$ and separates $K_v$ from $K_w$, and denote by $\gamma_u$ the geodesic with initial velocity $\gamma_u'(t_0) = u$. This geodesic visits neither $K_v$ nor $K_w$ in some interval $(t_0, t_2)$. On the other hand, the arc $\alpha(t)$ visits the cones $K_v$ and $K_w$ infinitely many times in any interval $(t_0, \tau)$, with $\tau > t_0$. Therefore, it must cross the geodesic $\gamma_u$ at some time $t_1 \in (t_0, t_2)$. Since $\gamma_u$ is the unique distance-realizing path between any two of its points, the arc $\alpha$ coincides with $\gamma_u$ in $[t_0, t_1]$. Thus $\alpha$ does not visit $K_v$ and $K_w$ in $(t_0, t_1)$. This contradiction proves that $\alpha$ is differentiable on the right at $t_0$. It follows from Lemma 12.3 that $\alpha$ is differentiable at every interior point $t_0 \in I$.
Suppose $\alpha$ is not $C^1$ on the right at $t_0$. (The argument is similar in case it is not $C^1$ on the left.) The vector $v = \alpha'(t_0)$ points inside $M$ or is tangent to the boundary of $M$. Since the velocities $\alpha'(t)$ are unit vectors and the curve $\alpha$ is not $C^1$ on the right at $t_0$, its derivative $\alpha'$ has an accumulation point $w \neq v$ when $t$ goes to $t_0$ from the right. Let $u$ be a unit vector spanning a line that separates $v$ from $w$. Consider three disjoint neighbourhoods $U, V, W$ of $u, v, w$ such that for every $u', v', w'$ in $U, V, W$ respectively, the line spanned by $u'$ separates $v'$ from $w'$. Let $K_V$ be the union of the rays contained in $M$ starting at $x_0$ with direction $v' \in V$, and let $R$ be any of these rays. Note that $u$ is transverse to all these rays. Working in a small enough neighbourhood of $x_0$, we can assume that the family $\Gamma$ of geodesics that visit $R$ with velocity $u$ foliates the cone $K_V$, and that their tangent vectors do not deviate too much from $u$ and thus lie in $U$. Since the velocity of $\alpha$ at $t_0$ lies in the open set $V$, the arc $\alpha$ restricted to some nontrivial interval $[t_0, t_3)$ lies in $K_V$. Now, since $w$ is an accumulation point for $\alpha'$ when $t$ goes to $t_0$ from the right, there exists $t_2 \in (t_0, t_3)$ such that $w' = \alpha'(t_2)$ lies in $W$. Let $x_2 = \alpha(t_2)$, and let $v'$ be the direction from $x_0$ to $x_2$. Let $\gamma$ be the geodesic of $\Gamma$ passing through $x_2$, and let $u' \in U$ be its velocity at $x_2$. The vector $w' = \alpha'(t_2)$ points strictly inside the region of $M$ delimited by $\gamma$ containing $x_0$, since the vector $v'$ points outside, and the line generated by the vector $u'$ separates $v'$ from $w'$. Therefore, the arc $\alpha$ starting at $x_0$ must cross $\gamma$ a first time at $t_1 \in (t_0, t_2)$ before crossing it again at $t_2$. Since $\gamma$ is the unique distance-realizing path between $\alpha(t_1)$ and $\alpha(t_2)$, the arc $\alpha$ coincides with $\gamma$ in $[t_1, t_2]$, which contradicts the fact that $\alpha$ is transverse to $\gamma$ at $t_2$ (or $t_1$). This finishes the proof of Theorem 12.1 for self-reverse metrics.
In the case of directed metrics we adapt the argument as follows. Apart from the foliation Γ, we need a second foliation Γ − of K V by geodesics transverse to the ray R with initial velocity −u. Then we proceed as in the proof and after choosing the point x 1 in K V , we let γ and γ − be the two geodesics of Γ and Γ − passing through x 1 . We keep only the part of each geodesic before it reaches x 1 and discard the rest. These two half geodesics delimit a region of K V containing x 0 . The curve α points strictly inside this region at x 1 . Therefore, it must cross either γ or γ − a first time before reaching x 1 . We derive a contradiction as in the previous proof.