1 Introduction

In this paper we investigate the so-called metastable exit times for the stochastic differential equation

$$\begin{aligned} dX_t = -\nabla F(X_t)\, dt + \sqrt{2\varepsilon }\, dB_t, \end{aligned}$$
(1.1)

where F is a smooth potential with many local minima and \(\varepsilon \) is a small number.

The main question of metastability is to determine how much time the process (1.1) takes to go from one local minimum to another. We call these the metastable exit times. This question has a rich history, and in the double well case with non-degenerate minima and a saddle point the answer is characterized by the Eyring–Kramers law [11, 17], which can be stated as follows: Assume that x and y are quadratic local minima of F, separated by a unique saddle z at which the Hessian has a single negative eigenvalue \(\lambda _1(z)\). Then the expected transition time \(\tau \) from x to y satisfies

$$\begin{aligned} {\mathbb {E}}^x[\tau ] \simeq \frac{2\pi }{|\lambda _1(z)|} \sqrt{\frac{|\det (\nabla ^2 F(z))|}{\det (\nabla ^2 F(x))}} e^{(F(z)-F(x))/\varepsilon }, \end{aligned}$$
(1.2)

where \(\simeq \) denotes that the comparison constant tends to 1 as \(\varepsilon \rightarrow 0\).
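To make (1.2) concrete, the following minimal sketch (our own illustration, not from the paper) evaluates its right-hand side for the hypothetical one-dimensional double well \(F(x) = (x^2-1)^2\), which has quadratic minima at \(\pm 1\) and a saddle at 0:

```python
import math

def eyring_kramers_1d(d2F_min, d2F_saddle, barrier, eps):
    """Right-hand side of (1.2) in one dimension, where the Hessians
    are scalars and lambda_1(z) = |F''(z)| at the saddle."""
    lam1 = abs(d2F_saddle)
    prefactor = 2 * math.pi / lam1 * math.sqrt(abs(d2F_saddle) / d2F_min)
    return prefactor * math.exp(barrier / eps)

# F(x) = (x**2 - 1)**2: F''(1) = 8, F''(0) = -4, barrier F(0) - F(1) = 1.
t = eyring_kramers_1d(d2F_min=8.0, d2F_saddle=-4.0, barrier=1.0, eps=0.25)
```

For \(\varepsilon = 0.25\) this gives roughly \(\tfrac{\pi }{2\sqrt{2}}\, e^{4} \approx 60.6\); as \(\varepsilon \rightarrow 0\) the exponential factor dominates every prefactor.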

The validity of the above formula has been studied quite extensively from a qualitative perspective, starting from the work of Freidlin and Wentzell; for more information, see the book [12]. Roughly 15 years ago, Bovier et al. produced a series of papers [6,7,8,9] (see also [5]) which provided the first proof of (1.2) in the general setting of Morse functions. Specifically, they showed that the comparison constant behaves like \(1+{\mathcal {O}}(\varepsilon ^{1/2}|\log \varepsilon |^{3/2})\). In these papers, they utilized the connection to classical potential theory in order to reduce the problem of estimating metastable exit times to the problem of estimating certain capacities sharply. This approach was later used in [4] to generalize (1.2) to degeneracies of general polynomial type.

In this paper we are interested in estimating the metastable exit times in the case of general types of degenerate critical points. This requires new techniques and effective notation from geometric function theory, which we describe below. Our motivation comes from the field of non-convex optimization, where we cannot expect the minima/saddles to be quadratic or even to have polynomial growth in any direction. In particular, such situations are well known in the context of neural networks, where the minima and saddles may be completely flat in some directions [15]. Furthermore, it seems that such flat minima are preferable; see [19] for a discussion and [3] for an explicit example.

The main goal is to estimate the dependence of the metastable exit times on the geometry of the potential F. In the proof of (1.2) in [8] this is reduced to estimating the ratio of the \(L^1\) norm of the hitting probability and the capacity. Thus, in order to estimate the metastable exit times, one needs to produce

  1. Estimates of the \(L^1\) integral of the hitting probability, i.e. the integral of capacitary potentials with respect to the Gibbs measure.

  2. Estimates of the capacity itself, i.e. estimates of the energy of the capacitary potentials.

The interesting point is that the influence of F on 1 and 2 is in a sense dual. Specifically, the shape of the minima of F influences 1, while the shape of the saddles between minima influences 2. As is well known, the main difficulty is to estimate 2, which is an interesting topic of its own.

Our main contribution is a sharp capacity estimate for a very general class of degenerate saddle points. In order to achieve this, we phrase the problem in the language of geometric function theory, where capacity estimates are a central topic [13, 20]. We introduce two geometric quantities which allow us to estimate the capacity in a sharp and natural way. As a byproduct, we see that in the case of several saddle points at the same height, the topology dictates how the local capacities add up. Here we restrict ourselves to two topological cases, which we call the parallel and the serial case, and it turns out that the formulas for the total capacity have natural counterparts in electrical networks of capacitors, see Theorem 1. Even in the context of non-degenerate saddles, our formulas provide a generalization of the result of [8], where the authors consider only the parallel case. As mentioned, we allow the saddle points to be degenerate, but we have to assume that the saddles are non-branching, see (1.6).

1.1 Assumptions and statement of the main results

In order to state our main results we first need to introduce our assumptions on the potential F. We also need to introduce notation from geometric function theory, which might seem rather heavy at first but turns out to be robust enough to treat potentials with possibly degenerate critical points.

Let us first introduce some general terminology. We say that a critical point z of a function \(f \in C^1({\mathbb {R}}^n)\) is a local minimum (maximum) of f if \(f(x) \ge f(z)\) (\(f(x) \le f(z)\)) in a neighborhood of z. If f is not locally constant at a critical point z, then z is a saddle point if it is neither a local minimum nor a local maximum. For technical reasons we also allow saddle points to include points z where f is locally constant. We say that a local minimum at z is proper if there exists a \({\hat{\delta }} > 0\) such that for every \(0< \delta < {\hat{\delta }}\) there exists a \(\rho > 0\) such that

$$\begin{aligned} f(x) \ge f(z) + \delta \quad \text {for all } \, x \in \partial B_{\rho }(z), \end{aligned}$$

where \(B_\rho (z)\) denotes an open ball with radius \(\rho \) centered at z. When the center is at the origin we use the short notation \(B_\rho \).

Let us then proceed to our assumptions on the potential F. Throughout the paper we assume that \(F \in C^2({\mathbb {R}}^n)\) and satisfies the following quadratic growth condition

$$\begin{aligned} F(x) \ge \frac{ |x|^2}{C_0} - C_0 \end{aligned}$$
(1.3)

for a constant \(C_0 \ge 1\). We assume that every local minimum point z of F is proper, as described above, and that there is a convex function \(G_z: {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) which has a proper minimum at 0 with \(G_z(0)= 0\) such that

$$\begin{aligned} \big | F(x+z)- F(z)- G_z(x) \big |\le \omega \big ( G_z(x)\big ), \end{aligned}$$
(1.4)

where \(\omega : [0,\infty ) \rightarrow [0,\infty )\) is a continuous and increasing function with

$$\begin{aligned} \lim _{s \rightarrow 0} \frac{\omega (s)}{s} = 0. \end{aligned}$$
(1.5)

We denote by \(\delta _0\) the largest number for which \(\omega (\delta ) \le \frac{\delta }{8}\) for all \(\delta \le 4 \delta _0\). For a local minimum point z and \(\delta < \delta _0\) we define the neighborhood

$$\begin{aligned} O_{z,\delta }:= \{ x \in {\mathbb {R}}^n: G_z(x) < \delta \} +\{ z\}. \end{aligned}$$

For the saddles, we assume that for every saddle point z of F there are convex functions \(g_z: {\mathbb {R}}\rightarrow {\mathbb {R}}\) and \(G_z:{\mathbb {R}}^{n-1} \rightarrow {\mathbb {R}}\) which have a proper minimum at 0 with \(g_z(0) = G_z(0) = 0\), and that there exists an isometry \(T_z: {\mathbb {R}}^n \rightarrow {\mathbb {R}}^n\) such that, denoting \(x = (x_1, x') \in {\mathbb {R}}\times {\mathbb {R}}^{n-1}\), it holds

$$\begin{aligned} \big | (F\circ T_z) (x) -F(z) + g_z(x_1) - G_z(x')\big |\le \omega ( g_z(x_1)) + \omega ( G_z(x')), \end{aligned}$$
(1.6)

where \(\omega : [0,\infty ) \rightarrow [0,\infty )\) is as in (1.5). The assumption (1.6) allows the saddle point to be degenerate, but we do not allow it to have many branches, i.e., the sets \(\{F < F(z)\}\cap B_{\rho }(z)\) cannot have more than two components. Note that the convex functions \(g_z, G_z\) and the isometry \(T_z\) depend on z, while the function \(\omega \) is the same for all saddle points. For a saddle point z and \(\delta < \delta _0\) we define the neighborhood

$$\begin{aligned} O_{z,\delta }:= T_z\left( \{x_1 \in {\mathbb {R}}: g_z(x_1)< \delta \} \times \{x' \in {\mathbb {R}}^{n-1}: G_z(x') < \delta \}\right) , \end{aligned}$$
(1.7)

where \(T_z\) is the isometry in (1.6). Note that, since the saddle may be flat, we should talk about sets rather than points. However, we adopt the convention that we always choose a representative point from each saddle (set) and thus we may label the saddles by points \(z_1, z_2, \dots \). Moreover, we assume that there is a \(\delta _1 \le \delta _0\) such that for \(\delta < \delta _1\) we have that if \(z_1\) and \(z_2\) are two different saddle points, then their neighborhoods \(O_{z_1, 3\delta }\) and \(O_{z_2, 3\delta }\) defined in (1.7) are disjoint. We assume the same for local minima (or more precisely, the representative points of sets of local minima).
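The decomposition (1.6) is easy to exhibit explicitly. As a minimal sketch (our own example, with assumed data), the degenerate saddle \(F(x_1,x_2) = -x_1^4 + x_2^4\) at the origin satisfies (1.6) with \(T_z\) the identity, \(g_z(s) = G_z(s) = s^4\) and \(\omega \equiv 0\):

```python
import random

# Hypothetical degenerate saddle (not from the paper):
# F(x1, x2) = -x1**4 + x2**4 has a completely flat, non-quadratic
# saddle at the origin with F(z) = 0.
F = lambda x1, x2: -x1**4 + x2**4
g = lambda s: s**4          # convex, proper minimum at 0, g(0) = 0
G = lambda s: s**4          # convex, proper minimum at 0, G(0) = 0

# With T_z = identity, (1.6) holds with omega identically zero:
# |F(x) - F(z) + g(x1) - G(x2)| vanishes at every point.
random.seed(0)
errs = [abs(F(x1, x2) - 0.0 + g(x1) - G(x2))
        for x1, x2 in [(random.uniform(-1, 1), random.uniform(-1, 1))
                       for _ in range(100)]]
max_err = max(errs)
```

Any genuinely non-polynomial flatness (e.g. \(g_z(s) = e^{-1/s^2}\) type behavior) is covered by the same assumption with a nontrivial \(\omega \).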

Let us then introduce the notation related to geometric function theory [13, 20]. Let us fix two disjoint sets A and B in a domain \(\Omega \) (an open and connected set). We say that a smooth path \(\gamma :[0,1] \rightarrow {\mathbb {R}}^n\) connects A and B in the domain \(\Omega \) if

$$\begin{aligned} \gamma (0) \in A, \quad \gamma (1) \in B \quad \text {and} \quad \gamma ([0,1]) \subset \Omega . \end{aligned}$$

We denote the set of all paths connecting A and B inside \(\Omega \) by \({\mathcal {C}}(A,B; \Omega )\). We follow the standard notation and define a dual object to this by saying that a smooth hypersurface \(S \subset {\mathbb {R}}^n\) (possibly with boundary) separates A from B in \(\Omega \) if every path \(\gamma \in {\mathcal {C}}(A,B; \Omega )\) intersects S. We denote the set of smooth hypersurfaces separating A and B inside \(\Omega \) by \({\mathcal {S}}(A,B;\Omega )\). We define the geodesic distance between A and B in \(\Omega \) as

$$\begin{aligned} d_{\varepsilon }(A,B; \Omega ):= \inf \left( \int _0^1 |\gamma '(t)| e^{\frac{F(\gamma (t))}{\varepsilon }} \, dt \,: \gamma \in {\mathcal {C}}(A,B; \Omega ) \right) \end{aligned}$$
(1.8)

and its dual, which we call the minimal cut, by

$$\begin{aligned} V_{\varepsilon }(A,B; \Omega ):= \inf \left( \int _{S} e^{-\frac{F(x)}{\varepsilon }} \, d {\mathcal {H}}^{n-1}(x):S \in {\mathcal {S}}(A,B;\Omega ) \right) . \end{aligned}$$
(1.9)

Whenever \(\Omega = {\mathbb {R}}^n\) we instead use the notation \(d_\varepsilon (A,B)\) and \(V_\varepsilon (A,B)\). Here \({\mathcal {H}}^{k}\) denotes the k-dimensional Hausdorff measure. Finally, we define the communication height between the sets A and B as

$$\begin{aligned} F(A; B):= \inf _{\gamma \in {\mathcal {C}}(A,B; {\mathbb {R}}^n)} \sup _{t \in [0,1]} \, F(\gamma (t)). \end{aligned}$$
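As a quick numerical illustration (our own sketch, with an assumed potential): in one dimension every path between the two minima of \(F(x) = (x^2-1)^2\) must sweep the whole interval between them, so the communication height reduces to the maximum of F there:

```python
import numpy as np

# Hypothetical 1D double well: minima at +-1, saddle at 0 with F(0) = 1.
F = lambda x: (x**2 - 1.0)**2

# In 1D any path from -1 to 1 covers the whole interval, so the
# communication height F(x_a; x_b) is simply max F on [-1, 1].
xs = np.linspace(-1.0, 1.0, 201)   # grid containing the saddle x = 0
comm_height = float(F(xs).max())
```

Here the infimum over paths is trivial; in higher dimensions one would genuinely have to optimize over distinct paths between the minima.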

Let us then assume that \(x_a\) and \(x_b\) are local minimum points and denote the communication height between \(x_a\) and \(x_b\) as

$$\begin{aligned} F(x_a; x_b):= F(\{x_a\}; \{x_b\}). \end{aligned}$$

Notice that \(F(x_a), F(x_b) \le F(x_a; x_b)\). For \(s \in {\mathbb {R}}\), denote

$$\begin{aligned} U_s: = \{ x \in {\mathbb {R}}^n: F(x) < F(x_a; x_b) + s\}. \end{aligned}$$

Assuming that \(0 < \delta \le \delta _1\), we note that the points \(x_a\) and \(x_b\) lie in different components of the set \(U_{-\delta /3}\) while they are in the same component of the set \(U_{\delta /3}\). We will always denote the components of \(U_{-\delta /3}\) containing the points \(x_a\) and \(x_b\) by \(U_{x_a}\) and \(U_{x_b}\), respectively. It is important to notice that if z is a saddle point and \(F(z) < F(x_a; x_b) + \delta /3\), then the neighborhood \(O_{z,\delta }\) defined in (1.7) intersects the set \(U_{-\delta /3}\). We will sometimes call the components of the set \(U_{-\delta /3}\) islands and the neighborhoods \(O_{z,\delta }\) bridges, since we may connect islands with bridges, see Fig. 1. (The terminology is obviously taken from the Seven Bridges of Königsberg.) We say that the set of saddle points \(Z_{x_a,x_b}= \{z_1, \dots , z_N\}\) charges capacity if it is the smallest set with the property that every \(\gamma \in {\mathcal {C}}(B_\varepsilon (x_a),B_\varepsilon (x_b); U_{\delta /3})\) intersects the bridge \(O_{z_i,\delta }\), defined in (1.7), for some \(z_i \in Z_{x_a,x_b}\). In particular, it holds that \(Z_{x_a,x_b} \subset U_{\delta /3}\).

Fig. 1 The neighborhood \(O_{z, \delta }\) of the saddle point z connects the sets \(U_{x_a}\) and \(U_{x_b}\)

We will focus on two different topological situations, where the saddle points in \(Z_{x_a,x_b} \) are either parallel or in series. We say that the points in \(Z_{x_a,x_b}\) are parallel if for every \(z_i \in Z_{x_a,x_b}\) there is a path

$$\begin{aligned} \gamma \in {\mathcal {C}}(B_\varepsilon (x_a),B_\varepsilon (x_b); U_{\delta /3}) \end{aligned}$$

passing only through \(z_i\). We say that the points in \(Z_{x_a,x_b}\) are in series if every path \(\gamma \in {\mathcal {C}}(B_\varepsilon (x_a),B_\varepsilon (x_b); U_{\delta /3})\) passes through the bridge \(O_{z_i,\delta }\), defined in (1.7), for all \(z_i \in Z_{x_a,x_b}\). In other words, if the points in \(Z_{x_a,x_b}= \{z_1, \dots , z_N\}\) are parallel, then the islands occupied by the points \(x_a\) and \(x_b\) respectively are connected with N bridges and we need to pass only one to get from \(x_a\) to \(x_b\). If they are in series, then we have to pass all N bridges in order to get from \(x_a\) to \(x_b\), see Fig. 2.

Fig. 2 The left picture shows the parallel case and the right the series case

Recall that \(U_{x_a}\) and \(U_{x_b}\) denote the islands, i.e., the components of \(U_{-\delta /3}\), which contain the points \(x_a\) and \(x_b\). If the points in \(Z_{x_a,x_b} = \{ z_1, \dots , z_N\}\) are parallel, then it follows from our assumptions on F that we may connect \({U_{x_a}}\) and \({U_{x_b}}\) with one bridge, i.e., for every \(z_i\) the set

$$\begin{aligned} U_{z_i,\delta }:= O_{z_i, \delta } \cup U_{x_a} \cup U_{x_b} \end{aligned}$$
(1.10)

is connected, again see Fig. 1. Then all paths \(\gamma \in {\mathcal {C}}(B_\varepsilon (x_a),B_\varepsilon (x_b); U_{z_i,\delta })\) pass through the bridge \(O_{z_i,\delta }\). If the points in \(Z_{x_a,x_b}\) are in series, then it is useful to order them \(Z_{x_a,x_b} = \{ z_1, \dots , z_N\}\) as follows. Let us consider a path \(\gamma \in {\mathcal {C}}(B_\varepsilon (x_a),B_\varepsilon (x_b); U_{\delta /3})\) which passes through each point in \(Z_{x_a,x_b}\) precisely once. This means that there are

$$\begin{aligned} 0< t_1< \dots< t_N <1 \quad \text { such that } \quad \gamma (t_i) = z_i, \end{aligned}$$
(1.11)

which gives a natural ordering for points in \(Z_{x_a,x_b}\). By the assumption (1.6), we also deduce that there are \(s_1, \dots , s_{N-1}\) such that \(t_i< s_i < t_{i+1}\) and \(\min \{ F(\gamma (s_i)), F(\gamma (s_{i+1}))\} < F(z_{i+1}) - \delta /3\). We denote

$$\begin{aligned} \gamma (s_i) = x_i, \,\, x_0 = x_a \quad \text { and } \quad x_{N} = x_b. \end{aligned}$$
(1.12)

The idea is that each point \(x_i\) then lies in a different island, i.e., a component of \(U_{-\delta /3}\), which we denote by \(U_{x_i}\), see Fig. 2. We may also choose the \(x_i\) to be local minimum points of F. Again it follows from our assumptions on F that the set \( \Omega = \bigcup _{i=1}^N \left( O_{z_i, \delta } \cup U_{x_i}\right) \cup U_{x_a}\) is connected.

We are now ready to state our main results. The first result is a quantitative lower bound on the capacity between the sets \(B_\varepsilon (x_a)\) and \(B_\varepsilon (x_b)\), where \(x_a\) and \(x_b\) are two local minimum points of F. For a given domain \(\Omega \subset {\mathbb {R}}^n\) we define the capacity of two disjoint sets \(A, B \subset \Omega \) with respect to the domain \(\Omega \) as

$$\begin{aligned} {\text {cap}}(A,B; \Omega ):= \inf \left( \varepsilon \int _{\Omega } |\nabla u|^2 e^{-\frac{F}{\varepsilon }}\, dx \,: \,\, u=1 \,\, \text {in } \, A, \,\, u=0 \,\, \text {in } \, B \right) . \end{aligned}$$

Above, the infimum is taken over functions \(u \in W_{loc}^{1,2}(\Omega )\). In the case \(\Omega = {\mathbb {R}}^n\) we denote

$$\begin{aligned} {\text {cap}}(A,B) = {\text {cap}}(A,B; {\mathbb {R}}^n) \end{aligned}$$

for short.

Finally, for functions f and g which depend continuously on \(\varepsilon >0\), we adopt the notation

$$\begin{aligned} f(\varepsilon ) \simeq g(\varepsilon ) \end{aligned}$$

when there exists a constant C depending only on the data of the problem such that

$$\begin{aligned} (1-{\hat{\eta }}(C,\varepsilon )) f(\varepsilon ) \le g(\varepsilon ) \le (1+{\hat{\eta }}(C,\varepsilon )) f(\varepsilon ), \end{aligned}$$

where \({\hat{\eta }}(C,\cdot ):[0,\infty ) \rightarrow [0,\infty )\) is an increasing and continuous function with \(\lim _{s \rightarrow 0} {\hat{\eta }}(C,s) = 0\). In all our estimates, the function \({\hat{\eta }}\) is specified and depends only on the function \(\omega \) from (1.4) and (1.6). In order to define it, we first let \(0 < \varepsilon \le \delta _0/2\) be fixed and let \(\varepsilon _1(\varepsilon )\) be the unique solution to

$$\begin{aligned} \sqrt{\omega (\varepsilon _1) \varepsilon _1} = \varepsilon . \end{aligned}$$
(1.13)

From the assumption that \(\omega (s) < s/2\) for \(s < \delta _0\) we see that \(\varepsilon < \varepsilon _1\). Furthermore, since \(\omega \) is increasing we get that \(\varepsilon _1 \rightarrow 0\) as \(\varepsilon \rightarrow 0\). Now, from the definition of \(\varepsilon _1\) in (1.13) we see, using \(\lim _{s \rightarrow 0} \frac{\omega (s)}{s} = 0\) and \(\varepsilon _1 \rightarrow 0\) as \(\varepsilon \rightarrow 0\), that

$$\begin{aligned} \frac{\varepsilon _1}{ \varepsilon } = \frac{\sqrt{\varepsilon _1}}{\sqrt{\omega (\varepsilon _1)}} \rightarrow \infty \qquad \text {as }\, \varepsilon \rightarrow 0. \end{aligned}$$

On the other hand, again using the same facts, we see that

$$\begin{aligned} \frac{\omega (\varepsilon _1)}{ \varepsilon } = \frac{\sqrt{\omega (\varepsilon _1)}}{\sqrt{\varepsilon _1}} \rightarrow 0\qquad \text {as }\, \varepsilon \rightarrow 0. \end{aligned}$$

Thus

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \frac{\varepsilon _1}{ \varepsilon } = \infty \quad \text {and} \quad \lim _{\varepsilon \rightarrow 0} \frac{\omega (\varepsilon _1)}{ \varepsilon } = 0. \end{aligned}$$
(1.14)
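For a concrete modulus, take \(\omega (s) = s^2\) (an assumption for illustration only); then (1.13) gives \(\varepsilon _1 = \varepsilon ^{2/3}\) in closed form, and both limits in (1.14) can be checked numerically:

```python
import math

# Hypothetical modulus omega(s) = s**2, so omega(s)/s -> 0 as s -> 0.
omega = lambda s: s * s

def eps1(eps):
    """Solve sqrt(omega(e1) * e1) = eps, i.e. (1.13), by bisection.
    For omega(s) = s**2 the exact solution is e1 = eps**(2/3)."""
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if math.sqrt(omega(mid) * mid) < eps:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

eps = 1e-6
e1 = eps1(eps)               # about eps**(2/3) = 1e-4
ratio_up = e1 / eps          # -> infinity as eps -> 0; here 100
ratio_down = omega(e1) / eps # -> 0 as eps -> 0; here 0.01
```

The faster \(\omega \) vanishes at 0, the closer \(\varepsilon _1\) stays to \(\varepsilon \), while the two limits in (1.14) persist.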

In the following we will denote

$$\begin{aligned} \eta (x)&= e^{-1/x} x^n \quad \text {and} \end{aligned}$$
(1.15)
$$\begin{aligned} {\hat{\eta }}(C,\varepsilon )&= \max \big \{ \varepsilon , \eta \left( C \frac{\varepsilon _1(\varepsilon )}{\varepsilon } \right) \big \}. \end{aligned}$$
(1.16)

Finally, in our main theorems and in our lemmas/propositions beyond Sect. 3 there is a ball \(B_R\) which contains all the level sets of interest; the existence of such a ball is guaranteed by the quadratic growth condition (1.3). Unless otherwise stated, the constants in the estimates in our main theorems and in Sect. 3 depend on \(n,\Vert \nabla F\Vert _{B_R}, \delta , R, C_0\). In particular, this applies to the constants in \({\hat{\eta }}\), and as such gives precise meaning to \(a \simeq b\).

Theorem 1

Assume that F satisfies the structural assumptions above. Let \(x_a\) and \(x_b\) be two local minimum points of F, let \(Z_{x_a,x_b}= \{z_1, \dots , z_N\}\) be the set of saddle points which charges capacity as defined above, and let \(0 < \delta \le \delta _1\) be fixed. There exists an \(\varepsilon _0\) with \(0 < \varepsilon _0 \le \delta \) such that if \(0 < \varepsilon \le \varepsilon _0\) the following holds:

If the points in \(Z_{x_a,x_b}= \{z_1, \dots , z_N\}\) are parallel, then, using the notation \(U_{z_i,\delta }\) from (1.10), it holds

$$\begin{aligned} {\text {cap}}(B_\varepsilon (x_a), B_\varepsilon (x_b)) \simeq \sum _{i=1}^N {\text {cap}}(B_\varepsilon (x_a), B_\varepsilon (x_b); U_{z_i,\delta }). \end{aligned}$$
(1.17)

Moreover for all \(i =1,\dots , N\) we have the estimate

$$\begin{aligned} {\text {cap}}(B_\varepsilon (x_a), B_\varepsilon (x_b); U_{z_i,\delta }) \simeq \varepsilon \frac{V_{\varepsilon }(B_\varepsilon (x_a),B_\varepsilon (x_b); U_{z_i,\delta }) }{d_{\varepsilon }(B_\varepsilon (x_a),B_\varepsilon (x_b); U_{z_i,\delta }) } e^{\frac{F(z_i)}{\varepsilon }}, \end{aligned}$$

where \(d_{\varepsilon }(B_\varepsilon (x_a),B_\varepsilon (x_b); U_{z_i,\delta }) \) and \(V_{\varepsilon }(B_\varepsilon (x_a),B_\varepsilon (x_b); U_{z_i,\delta })\) are defined in (1.8) and (1.9).

If the points in \(Z_{x_a,x_b}\) are in series, then, using the ordering \(z_1, \dots , z_N\) from (1.11) for the points in \(Z_{x_a,x_b}\) and the points \(x_0, x_1, \dots , x_{N}\) defined in (1.12), it holds

$$\begin{aligned} \frac{1}{{\text {cap}}(B_\varepsilon (x_a), B_\varepsilon (x_b))} \simeq \sum _{i=1}^N \frac{1}{{\text {cap}}(B_\varepsilon (x_{i-1}), B_\varepsilon (x_{i}))}, \end{aligned}$$
(1.18)

where we have the estimate

$$\begin{aligned} {\text {cap}}(B_\varepsilon (x_{i-1}), B_\varepsilon (x_i)) \simeq \varepsilon \frac{V_{\varepsilon }(B_\varepsilon (x_{i-1}), B_\varepsilon (x_i))}{d_{\varepsilon }(B_\varepsilon (x_{i-1}),B_\varepsilon (x_{i})) } e^{\frac{F(z_i)}{\varepsilon }} \end{aligned}$$

for all \(i =1,\dots , N\).

Let us make a few remarks on the statement of the above theorem. First, in the case of a single saddle \(Z_{x_a,x_b}= \{z\}\) the above capacity estimate reduces to

$$\begin{aligned} {\text {cap}}(B_\varepsilon (x_{a}),B_\varepsilon (x_b)) \simeq \varepsilon \frac{V_{\varepsilon }(B_\varepsilon (x_{a}),B_\varepsilon (x_{b}))}{d_{\varepsilon }(B_\varepsilon (x_{a}),B_\varepsilon (x_{b})) } e^{\frac{F(z)}{\varepsilon }}, \end{aligned}$$

where \(d_{\varepsilon }(B_\varepsilon (x_{a}),B_\varepsilon (x_{b}))\) is the geodesic distance between \(B_\varepsilon (x_{a})\) and \(B_\varepsilon (x_{b})\), and \(V_{\varepsilon }(B_\varepsilon (x_{a}),B_\varepsilon (x_{b}))\) is the area of the 'smallest cross section'. This is in accordance with the classical result on parallel plate capacitors, where the capacity depends linearly on the area and is inversely proportional to the distance between the plates.

The statement (1.17), when the saddle points are parallel, means that each saddle point \(z_1, \dots , z_N\) charges capacity and the total capacity is their sum. Again the situation is the same as for parallel plate capacitors with capacities \(C_1, \dots , C_N\), where the total capacity is the sum

$$\begin{aligned} C =C_1 + \dots + C_N. \end{aligned}$$

On the other hand, if the plate capacitors are in series their total capacity satisfies

$$\begin{aligned} \frac{1}{C} = \frac{1}{C_1} + \dots + \frac{1}{C_N} \end{aligned}$$

which is precisely the statement in (1.18).
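The two combination rules are, of course, elementary to state in code; the following sketch (with illustrative values only) mirrors (1.17) and (1.18):

```python
def cap_parallel(caps):
    # Parallel saddles add like parallel capacitors, cf. (1.17).
    return sum(caps)

def cap_series(caps):
    # Saddles in series combine harmonically, cf. (1.18).
    return 1.0 / sum(1.0 / c for c in caps)

p = cap_parallel([1.0, 2.0, 3.0])   # 6.0
s = cap_series([1.0, 2.0, 3.0])     # 1 / (1 + 1/2 + 1/3) = 6/11
```

In the series case the smallest local capacity dominates the total, just as the highest saddle dominates the exit time.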

Using the assumption (1.6) we calculate in Proposition 4.1 and in Proposition 4.2 more explicit, but less geometric, formulas for the single saddle case in a domain \(\Omega \). Namely, we have

$$\begin{aligned} d_{\varepsilon }(B_\varepsilon (x_{a}),B_\varepsilon (x_{b});\Omega ) \simeq e^{\frac{F(z)}{\varepsilon }}\int _{{\mathbb {R}}} e^{-\frac{g_{z}(x_1)}{\varepsilon }} \, d x_1 \end{aligned}$$

and

$$\begin{aligned} V_{\varepsilon }(B_\varepsilon (x_{a}),B_\varepsilon (x_{b});\Omega )\simeq e^{-\frac{F(z)}{\varepsilon }} \int _{{\mathbb {R}}^{n-1}} e^{-\frac{G_{z}(x')}{\varepsilon }} \, d x', \end{aligned}$$

and thus we recover the result in [4]. In particular, if the saddle point is non-degenerate, i.e., \(g_{z}\) and \(G_{z}\) are second order polynomials and the negative eigenvalue of \(\nabla ^2 F(z)\) is \(-\lambda _1\), we may estimate

$$\begin{aligned} d_{\varepsilon }(B_\varepsilon (x_{a}),B_\varepsilon (x_{b})) \simeq \sqrt{\frac{2 \pi \varepsilon }{\lambda _1}}\, e^{\frac{F(z)}{\varepsilon }} \end{aligned}$$

and

$$\begin{aligned} V_{\varepsilon }(B_\varepsilon (x_{a}), B_\varepsilon (x_{b})) \simeq (2 \pi \varepsilon )^{\frac{n-1}{2}} \frac{\sqrt{\lambda _1}\, e^{-\frac{F(z)}{\varepsilon }}}{\sqrt{\det (\nabla ^2 F(z))}}. \end{aligned}$$

In particular, we recover the classical formula (1.2).
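The first of these Laplace-type integrals can be checked numerically. The sketch below (our own illustration, with assumed values \(\lambda _1 = 2\) and \(\varepsilon = 0.01\)) compares \(\int _{{\mathbb {R}}} e^{-g_z(x_1)/\varepsilon }\, dx_1\) for the quadratic \(g_z(s) = \lambda _1 s^2/2\) with the closed form \(\sqrt{2\pi \varepsilon /\lambda _1}\):

```python
import numpy as np

# Non-degenerate model saddle (assumed data): g_z(s) = lam1 * s**2 / 2.
lam1, eps = 2.0, 0.01
s = np.linspace(-1.0, 1.0, 20001)
dx = s[1] - s[0]
vals = np.exp(-lam1 * s**2 / (2 * eps))
# Trapezoid rule; the Gaussian is negligible outside [-1, 1].
integral = (vals.sum() - 0.5 * (vals[0] + vals[-1])) * dx
closed_form = np.sqrt(2 * np.pi * eps / lam1)
rel_err = abs(integral - closed_form) / closed_form
```

The analogous \((n-1)\)-dimensional Gaussian integral over \(G_z\) produces the \((2\pi \varepsilon )^{(n-1)/2}/\sqrt{\det }\) factor in \(V_\varepsilon \) above.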

Our second main theorem is an estimate on the so-called metastable exit times. However, in order to state it we need some further definitions. Assume that the local minima of F are labelled \(x_i\) and ordered such that \(F(x_i) \le F(x_j)\) if \(i \le j\). We group the minima at the same level into sets \(G_k\), \(k=1,\ldots , K\): that is, \(x_i, x_j \in G_k\) if \(F(x_i) = F(x_j)\), and if \(x \in G_i\) and \(y \in G_j\) with \(i < j\), then \(F(x) < F(y)\). We also write \(F(G_i):= F(x)\) for \(x \in G_i\). Furthermore, we denote \(S_k = \bigcup _{i=1}^k G_i\) for \(k=1,\ldots , K\). We also consider \(G_k^\varepsilon = \bigcup _{x \in G_k} B_\varepsilon (x)\) and \(S_k^\varepsilon = \bigcup _{i=1}^k G_i^\varepsilon \).

In addition to the previous structural assumptions we assume further that for \(\delta _2 \le \delta _1\) small enough, it holds

$$\begin{aligned} F(G_{k+1}) - F(G_k) \ge \delta _2 \end{aligned}$$
(1.19)

for all \(k=1,\ldots ,K-1\). For a set A we denote by \(\tau _A\) the first hitting time of A by the process (1.1), i.e. for \(X_t\) as in (1.1) we define

$$\begin{aligned} \tau _A:= \inf \{ t \ge 0: X_t \in A\}. \end{aligned}$$
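The hitting time \(\tau _A\) can be sampled by a straightforward Euler–Maruyama discretization of (1.1). The following is a minimal sketch (our own illustration, with an assumed double-well potential and a moderately large \(\varepsilon \) so that transitions occur quickly):

```python
import math, random

# Euler-Maruyama sketch of tau_A for (1.1) with the hypothetical
# double well F(x) = (x**2 - 1)**2, started at the minimum x = 1
# and stopped in A = B_0.1(-1).  eps = 0.25 keeps runs short.
def hitting_time(eps=0.25, dt=1e-3, x0=1.0, seed=1, max_steps=2_000_000):
    rng = random.Random(seed)
    x, t = x0, 0.0
    for _ in range(max_steps):
        if abs(x + 1.0) < 0.1:             # X_t entered A
            return t
        dF = 4.0 * x * (x * x - 1.0)       # F'(x)
        x += -dF * dt + math.sqrt(2 * eps * dt) * rng.gauss(0.0, 1.0)
        t += dt
    return float("inf")

tau = hitting_time()
```

Averaging such samples over many seeds would approximate \({\mathbb {E}}^{x_0}[\tau _A]\), which is the quantity estimated in Theorem 2.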

In our second theorem we give an upper bound on the hitting time for the process defined by (1.1) to go from a local minimum point in \(G_{k+1}^\varepsilon \) to a lower one in \(S_k\).

Theorem 2

Assume that F satisfies the structural assumptions above, let \(\Omega \) be a domain that contains \(S_{k+1}^\varepsilon \). There exists an \(0 < \varepsilon _0 \le \delta _2\) such that if \(0< \varepsilon < \varepsilon _0\), the following holds:

For \(x \in G_{k+1}^\varepsilon \) we have

$$\begin{aligned} {\mathbb {E}}^x[\tau _{S_k^\varepsilon } {\mathbb {I}}_{\tau _{S_k^\varepsilon } < \tau _{\Omega ^c}}] \le \frac{C e^{-F(G_{k+1})/\varepsilon } \sum _{x \in G_{k+1} } |O_{x,\varepsilon }|}{\max _{x \in G_k, y \in G_{k+1}^\varepsilon } {\text {cap}}(B_\varepsilon (x),B_\varepsilon (y);\Omega )} + C \varepsilon ^{\alpha /2}. \end{aligned}$$

Let \(x_a \in G_k\), \(x_b \in G_{k+1} \) be a pair that maximizes the pairwise capacity. Then, with the notation of Theorem 1, we get in the parallel case

$$\begin{aligned} {\mathbb {E}}^x[\tau _{S_k^\varepsilon } {\mathbb {I}}_{\tau _{S_k^\varepsilon }< \tau _{\Omega ^c}}] \le \varepsilon ^{-1} \frac{C e^{-F(G_{k+1})/\varepsilon } \sum _{x \in G_{k+1} } |O_{x,\varepsilon }|}{\sum _{i=1}^N e^{-F(z_i)/\varepsilon } \frac{{\mathcal {H}}^{n-1}(\{G_{z_i}< \varepsilon \}) }{{\mathcal {H}}^{1}(\{g_{z_i} < \varepsilon \})}} + C \varepsilon ^{\alpha /2}, \end{aligned}$$

and in the series case we get

$$\begin{aligned} {\mathbb {E}}^x[\tau _{S_k^\varepsilon } {\mathbb {I}}_{\tau _{S_k^\varepsilon }< \tau _{\Omega ^c}}] \le \frac{C}{\varepsilon } \sum _{x \in G_{k+1} } \sum _{i=1}^N e^{(F(z_i)-F(G_{k+1}))/\varepsilon } \frac{|O_{x,\varepsilon }| {\mathcal {H}}^{1}(\{g_{z_i}< \varepsilon \}) }{{\mathcal {H}}^{n-1}(\{G_{z_i} < \varepsilon \})} + C \varepsilon ^{\alpha /2}. \end{aligned}$$

The additive error in Theorem 2 can be removed for small \(\varepsilon \) as the right hand side of the above tends to \(\infty \) as \(\varepsilon \rightarrow 0\).

If both the minima and saddles are non-degenerate points, \(\Omega = {\mathbb {R}}^n\), and there is only one saddle z connecting \(x_a,x_b\) (with \(F(x_a)< F(x_b) < F(z)\)), where \(x_a,x_b\) are the only minima of F, then the above estimate coincides with the Eyring–Kramers formula (up to a constant)

$$\begin{aligned} {\mathbb {E}}^{x_b}[\tau _{B_\varepsilon (x_a)}] \le C \frac{1}{\lambda _1} \sqrt{\frac{|\det (\nabla ^2 F(z))|}{\det (\nabla ^2 F(x_b))}} e^{(F(z)-F(x_b))/\varepsilon }. \end{aligned}$$

Here \(-\lambda _1\) is the negative eigenvalue of the Hessian of F at the saddle z.

2 Preliminaries

The generator of the process (1.1) is the following elliptic operator

$$\begin{aligned} L_\varepsilon = -\varepsilon \Delta + \nabla F \cdot \nabla . \end{aligned}$$
(2.1)

In this section we study the potential and regularity theory associated with the operator (2.1). We provide the identities and pointwise estimates that we will need in the course of the proofs. Most of these are standard, but we provide them adapted to our situation for the reader's convenience. We note that in this section we only require that the potential F is of class \(C^2\) and satisfies the quadratic growth condition (1.3).

2.1 Potential theory

Definition 2.1

Let \(\Omega \subset {\mathbb {R}}^n\) be a regular domain and let \(G_\Omega (x,y)\) be the Green’s function for \(\Omega \), i.e., for every \(f \in C(\Omega )\) the function

$$\begin{aligned} u_f = \int _{\Omega } G_\Omega (x,y)f(y) dy \end{aligned}$$

is the solution of the Poisson equation

$$\begin{aligned} {\left\{ \begin{array}{ll} L_\varepsilon u_f = f &{} \text {in } \, \Omega \\ u_f = 0 &{} \text {on } \, \partial \Omega . \end{array}\right. } \end{aligned}$$

The natural associated measures are the Gibbs measure \(d\mu _\varepsilon = e^{-F/\varepsilon } dx\) and the Gibbs surface measure \(d\sigma _\varepsilon = e^{-F/\varepsilon } d{\mathcal {H}}^{n-1}\).

Remark 2.2

Note that the Green’s function is symmetric w.r.t. the Gibbs measure, i.e.

$$\begin{aligned} G(x,y) e^{-F(x)/\varepsilon } = G(y,x)e^{-F(y)/\varepsilon }. \end{aligned}$$
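This symmetry survives discretization when the operator is written in divergence form. The sketch below (our own finite-difference illustration, with an assumed test potential) verifies the identity for the resulting discrete Green's matrix:

```python
import numpy as np

# Discretize L_eps u = -eps * e^{F/eps} (e^{-F/eps} u')' on (0, 1)
# with Dirichlet conditions; for smooth u this divergence form
# agrees with -eps u'' + F' u' from (2.1).
eps, n = 0.5, 80
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)                 # interior grid nodes
F = np.sin(3 * x)                              # a smooth test potential
# w[i] = e^{-F/eps} at the midpoint between nodes i-1 and i
# (node -1 and node n being the two boundary points).
mids = np.linspace(0.0, 1.0 - h, n + 1) + h / 2
w = np.exp(-np.sin(3 * mids) / eps)

A = np.zeros((n, n))
for i in range(n):
    scale = eps * np.exp(F[i] / eps) / h**2
    A[i, i] = scale * (w[i] + w[i + 1])
    if i > 0:
        A[i, i - 1] = -scale * w[i]
    if i < n - 1:
        A[i, i + 1] = -scale * w[i + 1]

G = np.linalg.inv(A)                           # discrete Green's matrix
M = np.exp(-F / eps)[:, None] * G              # G(x, y) e^{-F(x)/eps}
sym_err = np.abs(M - M.T).max() / np.abs(M).max()
```

The matrix \(D A\) with \(D = \mathrm{diag}(e^{-F(x_i)/\varepsilon })\) is symmetric by construction, which forces the symmetry of \(D G\) and hence the discrete analogue of the remark.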

We also have the fundamental Green’s identities. Here we assume that \(\Omega \) is a Lipschitz domain and denote the inner normal by n.

Lemma 2.3

Let \(\Omega \) be a smooth domain, let \(\psi ,\phi \in C^2({\overline{\Omega }})\), and let \(G_\Omega \) be the Green's function for \(\Omega \). Then the following Green's identities hold (Green's first identity)

$$\begin{aligned} \int _\Omega \psi L_\varepsilon \phi - \varepsilon \nabla \psi \cdot \nabla \phi d\mu _\varepsilon = \varepsilon \int _{\partial \Omega } \psi \nabla \phi \cdot n d\sigma _\varepsilon , \end{aligned}$$
(2.2)

and (Green’s second identity)

$$\begin{aligned} \int _\Omega \psi L_\varepsilon \phi - \phi L_\varepsilon \psi d\mu _\varepsilon = \varepsilon \int _{\partial \Omega } \psi \nabla \phi \cdot n - \phi \nabla \psi \cdot n d\sigma _\varepsilon . \end{aligned}$$
(2.3)

Furthermore, the following (balayage) representation formula holds: for every \(g \in C(\partial \Omega )\) the function

$$\begin{aligned} u(x) = \varepsilon e^{F(x)/\varepsilon } \int _{\partial \Omega } g \nabla _y G_\Omega (y,x) \cdot n d\sigma _\varepsilon (y) \end{aligned}$$
(2.4)

is the solution of the Dirichlet problem

$$\begin{aligned} {\left\{ \begin{array}{ll} L_\varepsilon u = 0 &{}\text {in } \, \Omega \\ u = g &{}\text {on } \, \partial \Omega . \end{array}\right. } \end{aligned}$$

Proof

Integration by parts gives

$$\begin{aligned} \int _\Omega \psi L_\varepsilon \phi d\mu _\varepsilon&= \int _\Omega \psi (-\varepsilon \Delta \phi + \nabla F \cdot \nabla \phi ) d\mu _\varepsilon \\&= \int _\Omega (\varepsilon \nabla \psi \cdot \nabla \phi - \frac{\varepsilon }{\varepsilon } \psi \nabla F \cdot \nabla \phi + \psi \nabla F \cdot \nabla \phi ) d\mu _\varepsilon \\ {}&\quad + \varepsilon \int _{\partial \Omega } \psi \nabla \phi \cdot n d\sigma _\varepsilon \\&=\int _\Omega \varepsilon \nabla \psi \cdot \nabla \phi d\mu _\varepsilon + \varepsilon \int _{\partial \Omega } \psi \nabla \phi \cdot n d\sigma _\varepsilon . \end{aligned}$$

The second Green’s identity follows from the first by applying it twice

$$\begin{aligned}{} & {} \int _\Omega \psi L_\varepsilon \phi - \varepsilon \nabla \psi \cdot \nabla \phi d\mu _\varepsilon -\int _\Omega \phi L_\varepsilon \psi + \varepsilon \nabla \psi \cdot \nabla \phi d\mu _\varepsilon \\{} & {} \qquad = \varepsilon \int _{\partial \Omega } \psi \nabla \phi \cdot n d\sigma _\varepsilon -\varepsilon \int _{\partial \Omega } \phi \nabla \psi \cdot n d\sigma _\varepsilon . \end{aligned}$$

We may now obtain the representation formula for the Dirichlet problem. We choose \(\phi (x) = G_\Omega (x,y)\) and obtain by Green’s second identity that

$$\begin{aligned} \psi (x)e^{-F(x)/\varepsilon } - \varepsilon \int _{\partial \Omega } \psi \nabla \phi \cdot n d\sigma _\varepsilon = \int _\Omega \phi L_\varepsilon \psi d\mu _\varepsilon . \end{aligned}$$

Now relabeling \(x \rightarrow y\), we get the representation formula (2.4). \(\square \)

Recall the variational definition of capacity: for two disjoint compact sets \(A,B \subset \Omega \)

$$\begin{aligned} {\text {cap}}(A,B; \Omega ):= \inf \left( \varepsilon \int _{\Omega } |\nabla h|^2 e^{-\frac{F}{\varepsilon }}\, dx \,: \,\, h \ge 1 \,\, \text {in } A, \,\, h \in H^1_0(\Omega \setminus B) \right) . \end{aligned}$$
(2.5)
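In one dimension the minimizer in (2.5) is explicit: the Euler-Lagrange equation \((e^{-F/\varepsilon } h')' = 0\) forces \(h' \propto e^{F/\varepsilon }\), so the capacity reduces to \(\varepsilon / \int _a^b e^{F/\varepsilon }\, ds\). The following sketch (a toy 1D setup of ours, not part of the paper; the names `cap_1d` and `cap_1d_energy` are assumptions) evaluates this closed form and checks it against the Dirichlet energy of the explicit minimizer.

```python
import numpy as np

def cap_1d(F, eps, a, b, n=4001):
    """1D capacity between the endpoints of (a, b) with weight e^{-F/eps}:
    the Euler-Lagrange equation (e^{-F/eps} h')' = 0 gives h' = -e^{F/eps}/Z
    with Z = int_a^b e^{F/eps} ds, hence cap = eps / Z.  Toy example only."""
    s = np.linspace(a, b, n)
    Z = np.sum(np.exp(F(s) / eps)) * (s[1] - s[0])
    return eps / Z

def cap_1d_energy(F, eps, a, b, n=4001):
    """Evaluate eps * int |h'|^2 e^{-F/eps} ds directly for the explicit
    minimizer, as a consistency check of the variational definition (2.5)."""
    s = np.linspace(a, b, n)
    ds = s[1] - s[0]
    w = np.exp(F(s) / eps)           # w = e^{F/eps}
    Z = np.sum(w) * ds
    hprime = -w / Z                  # h(x) = 1 - (1/Z) int_a^x e^{F/eps} ds
    return eps * np.sum(hprime ** 2 / w) * ds
```

Both routines return the same value \(\varepsilon /Z\), since \(\varepsilon \int (w/Z)^2 w^{-1} = \varepsilon /Z\) with \(w = e^{F/\varepsilon }\).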

The extension of capacity to open sets follows in the classical way

$$\begin{aligned} {\text {cap}}(U,B;\Omega ):= \sup \{{\text {cap}}(A,B;\Omega )\,:\, A\text { compact and }A \subset U\}. \end{aligned}$$

It is well known that for bounded sets with regular boundary the continuity of the capacity implies that \({\text {cap}}(U,B;\Omega ) = {\text {cap}}({\overline{U}},B;\Omega )\). The extension with respect to the second entry follows similarly. The variational definition of capacity has many equivalent forms; the one we will need is the following:

Lemma 2.4

Let \(A,B \subset \Omega \) be two disjoint compact sets. Then the variational formulation of capacity coincides with the balayage definition, i.e.,

$$\begin{aligned} {\text {cap}}(A,B;\Omega ) = \sup \left\{ \int _A e^{-F(y)/\varepsilon } d\mu (y): {\text {supp}}\mu \subset A, \int _{\Omega } G_{\Omega \setminus B}(x,y) d\mu (y) \le 1 \right\} . \end{aligned}$$

The unique measure which maximizes the above, i.e., satisfying

$$\begin{aligned} \int _A e^{-F(y)/\varepsilon } d\mu _{A,B}(y) = {\text {cap}}(A,B;\Omega ), \quad \int _{\Omega } G_{\Omega \setminus B}(x,y) d\mu (y) \le 1, \end{aligned}$$

is called the equilibrium measure \(\mu _{A,B}\). The corresponding equilibrium potential is defined as \(h_{A,B} = \int _{\Omega } G_{\Omega {\setminus } B} (x,y) d\mu _{A,B}(y)\) and is the minimizer of (2.5).

If in addition \(A,B\) are smooth, then we have

$$\begin{aligned} \begin{aligned} {\text {cap}}(A,B;\Omega )&= \int _A e^{-F(y)/\varepsilon } d\mu _{A,B}(y) = \varepsilon \int |\nabla h_{A,B}|^2 d\mu _\varepsilon \\&= -\varepsilon \int _{\partial A} \nabla h_{A,B} \cdot n d\sigma _\varepsilon . \end{aligned} \end{aligned}$$
(2.6)

Proof

The claim follows from the symmetry of the Green’s function (Remark 2.2) and from the strong maximum principle, which gives \(h_{A,B} = 1\) in A, see [1, 10]. From (2.2) we get that

$$\begin{aligned} \int _{\Omega \setminus B} h_{A,B} L_\varepsilon h_{A,B} - \varepsilon |\nabla h_{A,B}|^2 d\mu _\varepsilon = \varepsilon \int _{\partial (\Omega \setminus B)} h_{A,B} \nabla h_{A,B} \cdot n d\sigma _\varepsilon . \end{aligned}$$

Using \(h_{A,B} = 0\) on \(\partial (\Omega \setminus B)\) we see that the right hand side of the above is zero. Moreover, from \(L_\varepsilon h_{A,B} = \mu _{A,B}\) and from the definition of \(d\mu _\varepsilon \) we get

$$\begin{aligned} \int _A e^{-F(y)/\varepsilon } d\mu _{A,B}(y) = \varepsilon \int _{\Omega } |\nabla h_{A,B}|^2 d\mu _\varepsilon . \end{aligned}$$
(2.7)

Note that since \(h_{A,B} = 1\) in A and 0 on \(\partial (\Omega {\setminus } B)\), and since \(L_\varepsilon h_{A,B} = 0\) in \(\Omega {\setminus } (A \cup B)\), we have by the uniqueness of the solution to the Dirichlet problem that \(h_{A,B}\) coincides with the variational minimizer of (2.5). This establishes the first two equalities of (2.6).

To prove the last equality in (2.6) we insert \(h_{A,B} = \phi = \psi \) into (2.2) (Green’s first identity) and get

$$\begin{aligned} \int _{\Omega \setminus (A \cup B)} h_{A,B} L_\varepsilon h_{A,B} - \varepsilon |\nabla h_{A,B}|^2 d\mu _\varepsilon = \varepsilon \int _{\partial (\Omega \setminus (A \cup B))} h_{A,B} \nabla h_{A,B} \cdot n d\sigma _\varepsilon . \end{aligned}$$

Since \({\text {supp}}\mu _{A,B} \subset A\), and \(h_{A,B} = 0\) on B and 1 on A, we get

$$\begin{aligned} - \varepsilon \int _\Omega |\nabla h_{A,B}|^2 d\mu _\varepsilon = \varepsilon \int _{\partial A} \nabla h_{A,B} \cdot n d\sigma _\varepsilon , \end{aligned}$$
(2.8)

where n is the outward unit normal of A. The result now follows from (2.7) and (2.8). \(\square \)

Definition 2.5

Let \(\Omega \) be a smooth domain and \(A \subset \Omega \). Then we define the potential of the equilibrium potential as

$$\begin{aligned} w_{A,\Omega }(x) = \int _\Omega G_{\Omega \setminus A}(x,y) h_{A,\Omega ^c}(y) dy. \end{aligned}$$

The definition of the potential of the equilibrium potential might seem technical at first. However, \(w_{A,\Omega }\) has a clear probabilistic interpretation: it is the expected time to hit A for a process killed at \(\partial \Omega \). Indeed, the probabilistic interpretation of \(h_{A,\Omega ^c}\) is \({\mathbb {P}}(\tau _A < \tau _{\Omega ^c})\), i.e. the probability of hitting A before \(\Omega ^c\). By Dynkin’s formula we then see that

$$\begin{aligned} w_{A,{\Omega }}(x)&= {\mathbb {E}}^x[w_{A,{\Omega }}(X_{\tau _{A \cup {\Omega ^c}}})] - {\mathbb {E}}^x \left[ \int _0^{\tau _{A \cup {\Omega ^c}}} L_\varepsilon w_{A,{\Omega }}(X_t) dt \right] \\&= \int _0^{\infty } {\mathbb {E}}^x \left[ {\mathbb {I}}_{t \le \tau _{A \cup {\Omega ^c}}} {\mathbb {E}}^{X_t}[{\mathbb {I}}_{A}(X_{\tau _{A \cup {\Omega ^c}}})] \right] dt \\&= \int _0^{\infty } {\mathbb {E}}^x \left[ {\mathbb {I}}_{t \le \tau _{A \cup {\Omega ^c}}} {\mathbb {I}}_{A}(X_{\tau _{A \cup {\Omega ^c}}}) \right] dt \\&= {\mathbb {E}}^x[\tau _A {\mathbb {I}}_{\tau _A < \tau _{\Omega ^c}}]. \end{aligned}$$
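The identity \(h_{A,\Omega ^c}(x) = {\mathbb {P}}^x(\tau _A < \tau _{\Omega ^c})\) used above can be illustrated by direct simulation. The sketch below is a toy one-dimensional setup with \(F \equiv 0\) (so that the exact hitting probability is linear in the starting point), using an Euler-Maruyama discretization of \(dX = -F'(X)\,dt + \sqrt{2\varepsilon }\,dW\); all names and parameter choices are ours, not the paper's.

```python
import numpy as np

def hit_prob(x0, a=0.0, b=1.0, eps=0.5, dt=1e-4, n_paths=4000, t_max=20.0,
             seed=0):
    """Monte Carlo estimate of P^{x0}(tau_A < tau_{Omega^c}) for the
    diffusion dX = -F'(X) dt + sqrt(2 eps) dW with F = 0, where
    A = (-inf, a] and Omega = (a, b).  Toy 1D illustration only."""
    rng = np.random.default_rng(seed)
    x = np.full(n_paths, float(x0))
    alive = np.ones(n_paths, dtype=bool)
    hit_a = np.zeros(n_paths, dtype=bool)
    for _ in range(int(t_max / dt)):
        if not alive.any():
            break
        # Euler-Maruyama step for the still-alive paths (zero drift here).
        x[alive] += np.sqrt(2.0 * eps * dt) * rng.standard_normal(alive.sum())
        left, right = alive & (x <= a), alive & (x >= b)
        hit_a |= left
        alive &= ~(left | right)
    return hit_a.mean()
```

For \(F \equiv 0\), \(a = 0\), \(b = 1\) the exact answer is \(1 - x_0\), so `hit_prob(0.3)` should return a value close to \(0.7\), up to Monte Carlo and discretization error.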

We also have the following integration by parts formula for the potential of the equilibrium potential:

Lemma 2.6

Let \(\Omega \) be a smooth domain, let \(A \subset \Omega \) be a smooth set, and assume that \(B_{2\rho }(x) \subset \Omega \setminus A\). Then

$$\begin{aligned} \int _{\partial B_\rho (x)} w_{A,\Omega }(y) e^{-F(y)/\varepsilon } d\mu _{B_\rho (x),A}(y) = \int _{\Omega \setminus A} h_{A,\Omega ^c}(z) h_{B_\rho (x),A}(z) d\mu _\varepsilon . \end{aligned}$$

The above statement looks more familiar if we write it in the formal way as

$$\begin{aligned} \int _{\Omega \setminus A} w_{A,\Omega }(y) e^{-F(y)/\varepsilon } d L_\varepsilon h_{B_\rho ,A} (y) = \int _{\Omega \setminus A} L_\varepsilon w_{A,\Omega } e^{-F/\varepsilon } h_{B_\rho ,A} dz, \end{aligned}$$

where \(d L_\varepsilon h_{B_\rho ,A} (y)\) is the equilibrium measure \(d\mu _{B_\rho (x),A}(y)\).

Proof

Using the definition of \(w_{A,\Omega }\) and Fubini’s theorem

$$\begin{aligned}&\int _{\partial B_\rho (x)} w_{A,\Omega }(y) e^{-F(y)/\varepsilon } d\mu _{B_\rho (x),A}(y) \\&\quad = \int _{\partial B_\rho (x)} \left( \int _{\Omega \setminus A} G_{\Omega \setminus A}(y,z) h_{A,\Omega ^c}(z)dz \right) e^{-F(y)/\varepsilon } d\mu _{B_\rho (x),A}(y) \\&\quad = \int _{\Omega \setminus A} h_{A,\Omega ^c}(z) \int _{\partial B_\rho (x)} G_{\Omega \setminus A}(y,z) e^{-F(y)/\varepsilon } d\mu _{B_\rho (x),A}(y) dz. \end{aligned}$$

Using the symmetry of the Green’s function, Remark 2.2,

$$\begin{aligned}&\int _{\Omega \setminus A} h_{A,\Omega ^c}(z) \int _{\partial B_\rho (x)} G_{\Omega \setminus A}(y,z) e^{-F(y)/\varepsilon } d\mu _{B_\rho (x),A}(y) dz \\&\quad = \int _{\Omega \setminus A} h_{A,\Omega ^c}(z) e^{-F(z)/\varepsilon } \int _{\partial B_\rho (x)} G_{\Omega \setminus A}(z,y) d\mu _{B_\rho (x),A}(y) dz. \end{aligned}$$

Note that \({\text {supp}}\mu _{B_\rho (x),A} \subset \partial B_\rho (x)\) and as such

$$\begin{aligned} u (z) = \int _{\Omega } G_{\Omega \setminus A}(z,y) d\mu _{B_\rho (x),A}(y) = \int _{\partial B_\rho (x)} G_{\Omega \setminus A}(z,y) d\mu _{B_\rho (x),A}(y) \end{aligned}$$

solves the equation \(L_\varepsilon u = \mu _{B_\rho (x),A}\) in \(\Omega \) and \(u = 0\) on \(\partial (\Omega \setminus A)\). Consequently, by the uniqueness result for the Dirichlet-Poisson problem, we get \(u(z) = h_{B_\rho (x),A}(z)\). Hence

$$\begin{aligned}{} & {} \int _{\Omega \setminus A} h_{A,\Omega ^c}(z) e^{-F(z)/\varepsilon } \int _{\partial B_\rho (x)} G_{\Omega \setminus A}(z,y) d\mu _{B_\rho (x),A}(y) dz \\{} & {} \quad = \int _{\Omega \setminus A} h_{A,\Omega ^c}(z) h_{B_\rho (x),A}(z) d\mu _\varepsilon (z). \end{aligned}$$

Combining the equalities above yields the result. \(\square \)

2.2 Classical pointwise estimates

In this section we recall classical pointwise estimates for functions which satisfy

$$\begin{aligned} L_\varepsilon u = f \end{aligned}$$

in a domain \(\Omega \), where the operator \(L_\varepsilon \) is defined in (2.1). First, since we assume that \(F \in C^2({\mathbb {R}}^n)\), for all Hölder continuous f the solutions of the above equation are \(C^{2,\alpha }\)-regular, see [14]. However, these regularity estimates depend on \(\varepsilon \) and blow up as \(\varepsilon \rightarrow 0\). The point is that we may obtain regularity estimates with constants independent of \(\varepsilon \) if we restrict ourselves to small enough scales. To this aim, for a given domain \(\Omega \) we choose a positive number \(\nu \) such that

$$\begin{aligned} \frac{\Vert \nabla F\Vert _{L^\infty (\Omega )}}{\varepsilon } \le \nu . \end{aligned}$$

We have the following two results from [14].

Lemma 2.7

(Harnack’s inequality). Let \(\Omega \) be a domain and let \(u \in C^2(\Omega )\) be a non-negative function satisfying \(L_\varepsilon u = 0\). Then for any \(B_{3R}(x) \subset \Omega \) it holds that

$$\begin{aligned} \sup _{B_R(x)} u \le C \inf _{B_R(x)} u \end{aligned}$$

for a constant \(C= C(n,\nu R)\). In particular, if \(\Vert \nabla F\Vert _{L^\infty (\Omega )} \le L\), then for \(R \le \frac{\varepsilon }{L}\) the constant C is independent of \(\varepsilon \). Furthermore, for \(p \in (1,\infty )\) and any number k we have

$$\begin{aligned} \sup _{B_R(x)} (u-k)_+ \le C_p \left( \fint _{B_{2R}(x)} (u-k)_+^p \, dx \right) ^{1/p} \end{aligned}$$

and

$$\begin{aligned} \sup _{B_R(x)} (u-k)_- \le C_p \left( \fint _{B_{2R}(x)} (u-k)_-^p \, dx \right) ^{1/p}, \end{aligned}$$

where the symbol \(\fint \) denotes the average integral, and the constant \(C_p\) in addition to the above depends also on p.

In the non-homogeneous case \(L_\varepsilon u = f\) we have the following generalization of Harnack’s inequality.

Lemma 2.8

Let \(\Omega \) be a domain and let \(u \in C^2(\Omega )\) be a non-negative function satisfying \(L_\varepsilon u = f\). Then for any \(B_{3R}(x) \subset \Omega \) it holds that

$$\begin{aligned} \sup _{B_R(x)} u \le C \left( \inf _{B_R(x)} u + \frac{R}{\varepsilon } \Vert f\Vert _{L^n(B_{2R}(x))} \right) \end{aligned}$$

for a constant \(C= C(n,\nu R)\). In particular, if \(\Vert \nabla F\Vert _{L^\infty (\Omega )} \le L\), then for \(R \le \frac{\varepsilon }{L}\) the constant C is independent of \(\varepsilon \) and we have

$$\begin{aligned} \sup _{B_R(x)} u \le C \left( \inf _{B_R(x)} u + \Vert f\Vert _{L^n(B_{2R}(x))} \right) . \end{aligned}$$

The Harnack inequality in Lemma 2.7 holds also in the case of the punctured ball.

Lemma 2.9

Let \(u \in C^2(B_{3R}(x) \setminus \{x\})\) be a non-negative function satisfying \(L_\varepsilon u = 0\) in \(B_{3R}(x) {\setminus } \{x\}\). Then

$$\begin{aligned} \sup _{\partial B_R(x)} u \le C \inf _{\partial B_R(x)} u \end{aligned}$$

for a constant \(C= C(n,\nu R)\).

Proof

By translating the coordinates we may assume that \(x = 0\). Let \(x_0, y_0 \in \partial B_R\) be such that \(\sup _{\partial B_R(x)} u = u(x_0)\) and \(\inf _{\partial B_R(x)} u = u(y_0)\). We choose points \(x_1, \dots , x_{N-1}, x_N \in \partial B_R\) such that \(|x_i-x_{i-1}| \le R/4\) and \(x_N = y_0\). Note that the number N is bounded by a constant depending only on n. Now we may use Harnack’s inequality (Lemma 2.7) in the balls \(B_{R/4}(x_i)\) to get

$$\begin{aligned} u(x_{i-1}) \le C u(x_{i}). \end{aligned}$$

We obtain the claim by applying the above over \(i=1, \dots , N\). \(\square \)

Lemma 2.10

Let \(\Omega \) be a domain and let \(u,h \in C^2(\Omega )\) be non-negative functions such that \(L_\varepsilon u = h\) and h satisfies Harnack’s inequality with constant \(c_0\). Then the function \(v = u+h\) satisfies Harnack’s inequality, i.e., for all \(B_{3R}(x) \subset \Omega \) it holds that

$$\begin{aligned} \sup _{B_R(x)} v \le C \inf _{B_R(x)} v \end{aligned}$$

for a constant \(C = C(n,\nu R, R^2/\varepsilon ,c_0)\). In particular, if \(\Vert \nabla F\Vert _{L^\infty (\Omega )} \le L\), then for \(R \le \min \{ \varepsilon /L, \sqrt{\varepsilon }\}\) the constant C is independent of \(\varepsilon \).

Proof

Again we may assume that \(x = 0\). Using Lemma 2.8 and Harnack’s inequality for h yields

$$\begin{aligned} \sup _{B_R} u&\le C \inf _{B_R} u + \frac{C R}{\varepsilon } \Vert h\Vert _{L^n(B_{2R})} \\&\le C \inf _{B_R} u + \frac{C R}{\varepsilon } |B_R|^{1/n} \inf _{B_{R}} h\\&\le C \inf _{B_R} u + \frac{C R^2}{\varepsilon } \inf _{B_{R}} h. \end{aligned}$$

Now, using Harnack’s inequality for h again, we obtain

$$\begin{aligned} \sup _{B_R} v \le \sup _{B_R} u + \sup _{B_R} h \le C \inf _{B_R} u + C \inf _{B_R} h\le C \inf _{B_R} v \end{aligned}$$

for a constant C as in the statement. This proves the claim. \(\square \)

The Harnack inequality in Lemma 2.7 implies Hölder continuity for solutions of \(L_\varepsilon u = 0\).

Lemma 2.11

Let \(u \in C^2(B_{3R}(x))\) be a function such that for any constant c, for which \(v = u+c\) is non-negative, the function v satisfies Harnack’s inequality with constant \(C_0\), independent of c. Then there exists \(C = C(C_0) > 1\) and \(\alpha = \alpha (C_0) \in (0,1)\) such that, for all \(\rho \le R\), it holds that

$$\begin{aligned} {\text {osc}}_{B_\rho (x)} u \le C \left( \frac{\rho }{R}\right) ^\alpha {\text {osc}}_{B_{R}(x)} u. \end{aligned}$$

In particular, if \(u,h \in C^2(\Omega )\) are non-negative functions such that \(L_\varepsilon u = h\) and h satisfies Harnack’s inequality with constant \(C_0\), then \(u+h\) satisfies the estimate above.

Proof

The proof follows verbatim from the classical proof of Moser, see [14, Theorem 8.22]. \(\square \)
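For the reader’s convenience, here is a sketch of the key step of that classical argument: Harnack’s inequality forces geometric decay of the oscillation. Set \(M = \sup _{B_{3R}(x)} u\) and \(m = \inf _{B_{3R}(x)} u\), and apply Harnack’s inequality to the non-negative solutions \(M - u\) and \(u - m\) in \(B_R(x)\):

$$\begin{aligned} M - \inf _{B_R(x)} u \le C \big ( M - \sup _{B_R(x)} u \big ), \qquad \sup _{B_R(x)} u - m \le C \big ( \inf _{B_R(x)} u - m \big ). \end{aligned}$$

Summing the two inequalities and rearranging yields

$$\begin{aligned} \underset{B_R(x)}{{\text {osc}}}\, u \le \frac{C-1}{C+1}\, \underset{B_{3R}(x)}{{\text {osc}}}\, u, \end{aligned}$$

and iterating this over the scales \(3^{-k}R\) gives an estimate of the stated form with \(\alpha = \log \left( \tfrac{C+1}{C-1}\right) /\log 3\).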

3 Technical Lemmas

In this section we provide some preliminary results for the proofs of the main theorems. We recall that we assume that the potential F satisfies the structural assumptions from Sect. 1.1, and that from this moment on our constants are allowed to depend on the data, see paragraph after (1.16).

3.1 Rough estimates for potentials

In this subsection we provide estimates for the capacitary potential \(h_{A,B}\), when A and B are two disjoint closed sets. The first estimate is the so-called renewal estimate of [8]. In order to trace the dependencies of the constants, we provide a proof.

Lemma 3.1

Let \(\Omega \) be a smooth domain, let \(A, B \subset \Omega \) be disjoint smooth sets, and consider \(h_{A,B}\) as the capacitary potential in \(\Omega \). Assume that \(B_{4\varrho }(x) \subset (\Omega {\setminus } (A \cup B))\), and that \(r \le \min \left\{ \frac{\varepsilon }{\Vert \nabla F\Vert _{L^\infty (B_{2\varrho }(x))}}, \varrho \right\} \). Then there exists a constant \(C = C(n,\nu ) > 1\) such that

$$\begin{aligned} h_{A,B}(x) \le C \frac{{\text {cap}}(B_{r}(x),A;\Omega )}{{\text {cap}}(B_{r}(x),B;\Omega )}. \end{aligned}$$

Proof

Again, without loss of generality, we may assume that \(x = 0\). Since \(h_{A,B \cup B_r} = h_{A,B}\) on \(\partial (\Omega {\setminus } (A \cup B))\), we can use (2.4) to represent \(h_{A,B}\) as follows

$$\begin{aligned} h_{A,B}(z) = \varepsilon e^{F(z)/\varepsilon } \int _{\partial (\Omega \setminus (A \cup B))} h_{A,B \cup B_r} \nabla G_{\Omega \setminus (A \cup B)}(y,z) \cdot n d\sigma _\varepsilon (y). \end{aligned}$$
(3.1)

Now by Green’s second identity (2.3) in \(\Omega {\setminus } (A \cup B \cup B_r)\) and (3.1) we see that, for \(z \in \Omega \),

$$\begin{aligned} h_{A,B}(z)= & {} h_{A,B \cup B_r}(z) \nonumber \\{} & {} - \varepsilon e^{F(z)/\varepsilon } \int _{\partial B_r} G_{\Omega \setminus (A \cup B)}(y,z) \nabla h_{A,B \cup B_r}(y) \cdot n d\sigma _\varepsilon (y), \end{aligned}$$
(3.2)

where n is the inward unit normal of \(B_r\). First note that by (2.3) we can identify the equilibrium measure as

$$\begin{aligned} \mu _{B \cup B_r,A} = -\varepsilon \nabla h_{B \cup B_r,A} \cdot n d\sigma _\varepsilon = \varepsilon \nabla h_{A,B \cup B_r} \cdot n d\sigma _\varepsilon . \end{aligned}$$

Using that \(h_{A,B\cup B_r} = 1-h_{B \cup B_r, A}\), together with the above and (3.2), we get for \(z \in \overline{B_r}\) (since \(h_{A, B \cup B_r}(z) = 0\)) that

$$\begin{aligned} h_{A,B}(z) = \int _{\partial B_r} G_{\Omega \setminus (A \cup B)}(y,z) e^{(F(z)-F(y))/\varepsilon } d\mu _{B \cup B_r,A}(y). \end{aligned}$$
(3.3)

First note that \(\mu _{B_r \cup B,A} |_{\partial B_r}\) is an admissible measure for \({\text {cap}}(B_r,A;\Omega )\): by the comparison principle the potentials of ordered measures are ordered, and the support of \(\mu _{B_r \cup B,A} |_{\partial B_r}\) is contained in \(\overline{B_r}\). To bound \(h_{A,B}\) from above, note that by the balayage representation of capacity (see Lemma 2.4) and the above, we obtain

$$\begin{aligned} \int _{\partial B_r} e^{-F(y)/\varepsilon } d\mu _{B \cup B_r,A}(y) \le {\text {cap}}(B_r,A;\Omega ). \end{aligned}$$

Applying the above to (3.3) gives, for \(z \in B_r\),

$$\begin{aligned} h_{A,B}(z) \le \sup _{y \in \partial B_r} G_{\Omega \setminus (A \cup B)}(y,z) e^{F(z)/\varepsilon }{\text {cap}}(B_r,A;\Omega ). \end{aligned}$$
(3.4)

It remains to bound the Green’s function. For \(z \in \overline{B_r}\) we have by (2.4) and (2.6) and Remark 2.2 that

$$\begin{aligned} \begin{aligned} 1&= h_{B_r,A \cup B}(z) = \int _{\partial B_r} G_{\Omega \setminus (A \cup B)}(z,y) d\mu _{B_r,A \cup B}(y) \\&= \int _{\partial B_r} G_{\Omega \setminus (A \cup B)}(y,z) e^{(F(z) -F(y))/\varepsilon } d\mu _{B_r,A \cup B}(y) \\&\ge \inf _{\partial B_r} G_{\Omega \setminus (A \cup B)}(y,z) e^{F(z)/\varepsilon } {\text {cap}}(B_r,A \cup B;\Omega ) \\&\ge \inf _{\partial B_r} G_{\Omega \setminus (A \cup B)}(y,z) e^{F(z)/\varepsilon } {\text {cap}}(B_r,B;\Omega ). \end{aligned} \end{aligned}$$
(3.5)

Now putting together (3.4) and (3.5) and Lemma 2.9 we are done. \(\square \)

The result below is a version of the rough capacity bound of [8], but we give a simplified proof. We will later use a similar argument in the proof of Theorem 1.

Lemma 3.2

Let \(D \subset B_R\) be a smooth closed set. Let \(x \in B_R {\setminus } D\) be such that \(B_{4\rho }(x) \subset B_R {\setminus } D\), for \(\rho \le \varepsilon \). Then there exist constants \(q_1,q_2 \in {\mathbb {R}}\) and \(C > 1\) such that

$$\begin{aligned} \frac{1}{C}\varepsilon ^{q_1} \rho ^{n-1} e^{-F(x;D)/\varepsilon } \le {\text {cap}}(B_\rho (x),D) \le C \varepsilon \rho ^{q_2} e^{-F(x;D)/\varepsilon }. \end{aligned}$$

Proof

We assume without loss of generality that \(F(B_\rho ;D) = 0\), since the quantities can always be scaled back. Consider \(\gamma \in {\mathcal {C}}(B_\rho (x),D;B_R)\) (i.e. a curve connecting \(B_\rho (x)\) and D inside \(B_R\)) such that \(\sup _t F(\gamma (t)) \le C\varepsilon \) and let \(u(z) = h_{D,B_\rho (x)}(z)\). We first note by Lemma 2.4 that

$$\begin{aligned} {\text {cap}}(B_\rho (x),D) = \varepsilon \int |\nabla u|^2 e^{-F(y)/\varepsilon } dy. \end{aligned}$$

Fix an \((n-1)\)-dimensional disk \(D_\rho \) of radius \(\rho \). Then by Cauchy-Schwarz

$$\begin{aligned} \int _{B_R} |\nabla u|^2 e^{-F(y)/\varepsilon } dy \ge \int _{0}^1 \int _{D_\rho } \left| \left\langle \frac{{\dot{\gamma }}}{|{\dot{\gamma }}|}, \nabla u(\gamma (t)+z) \right\rangle \right| ^2 |{\dot{\gamma }}| d\sigma _\varepsilon (z)dt. \end{aligned}$$

By the fundamental theorem of calculus and Cauchy-Schwarz, we have for a fixed point \(z \in D_\rho \) that

$$\begin{aligned} 1&= u(\gamma (1)) - u(\gamma (0)) = \int _0^1 \frac{d}{dt} u(\gamma (t)+z)dt \\&= \int _0^1 \frac{d}{dt} u(\gamma (t)+z)\frac{\sqrt{|{\dot{\gamma }}|}}{\sqrt{|{\dot{\gamma }}|}} e^{-F(\gamma (t)+z)/(2\varepsilon )}e^{F(\gamma (t)+z)/(2\varepsilon )}dt \\&\le \left( \int _0^1 |\frac{d}{dt} u(\gamma (t)+z)|^2 \frac{1}{|{\dot{\gamma }}|}e^{-F(\gamma (t)+z)/\varepsilon } dt \right) ^{1/2} \left( \int |{\dot{\gamma }}| e^{F(\gamma (t)+z)/\varepsilon }dt \right) ^{1/2}. \end{aligned}$$

From the above we get

$$\begin{aligned} \int _{B_R} |\nabla u|^2 e^{-F(y)/\varepsilon } dy&\ge \int _{0}^1 \int _{D_\rho } \left| \left\langle \frac{{\dot{\gamma }}}{|{\dot{\gamma }}|} , \nabla u(\gamma (t)+z) \right\rangle \right| ^2 |{\dot{\gamma }}| d\sigma _\varepsilon (z)dt \\&\ge \int _{D_\rho } \left( \int _0^1 |{\dot{\gamma }}| e^{F(\gamma (t)+z)/\varepsilon } dt \right) ^{-1} d\sigma _\varepsilon (z). \end{aligned}$$

Now since F is Lipschitz in \(B_R\) and \(F(B_\rho ;D) = 0,\) we know that there exists a constant \(C(\gamma )\) such that, for \(z \in D_\rho \) and \(\rho < 2\varepsilon \),

$$\begin{aligned} \int _0^1 |{\dot{\gamma }}| e^{F(\gamma (t)+z)/\varepsilon } dt \le C(\gamma ). \end{aligned}$$
(3.6)

In the above the constant C depends on the length of \(\gamma \), which can be assumed to be bounded. To see this, take an \(\varepsilon \)-neighborhood \(E_\varepsilon \) of \(\gamma \) and consider disjoint balls \(B_\varepsilon (y_i) \subset E_\varepsilon \) such that \( E_{\varepsilon } \subset \bigcup _i B_{5\varepsilon }(y_i) \), given e.g. by the Vitali covering lemma. The number of such balls is at most \(C_1 R^n/\varepsilon ^n\), for a dimensional constant \(C_1\). If we construct a piecewise linear curve \(\gamma _\varepsilon \) connecting the centers of the balls in the covering, this curve will be inside \(E_{10\varepsilon }\) and its length will be bounded by \(C_2 R^n/\varepsilon ^{n-1}\), for a dimensional constant \(C_2\). This newly constructed curve can be mollified to obtain a smooth curve without increasing the length by more than a constant factor. From the above and the Lipschitz continuity of F it is clear that \(\sup _t F(\gamma _\varepsilon (t)) \le C\varepsilon \), and as such we can replace \(\gamma \) with \(\gamma _\varepsilon \) in the above and get from (3.6) that there is a constant \(C > 1\) depending only on the data such that

$$\begin{aligned} \int _0^1 |{\dot{\gamma }}| e^{F(\gamma (t)+z)/\varepsilon } dt \le \varepsilon ^{1-n} C. \end{aligned}$$

This implies that for a new constant C we have

$$\begin{aligned} \int _{B_R} |\nabla u|^2 e^{-F(y)/\varepsilon } dy \ge C \varepsilon ^{n-1} \rho ^{n-1}, \end{aligned}$$

which completes the proof of the lower bound after rescaling our potential F.

To prove the upper bound we distinguish two cases. In the case when \(F(x;D) = F(x)\) we can take a cutoff function \(\chi _{B_\rho (x)} \le \phi \le \chi _{B_{2\rho }(x)}\) with \(|\nabla \phi | \le C/\rho \) as a competitor in the variational formulation of capacity (2.5). Then

$$\begin{aligned} \int _{B_R} |\nabla \phi |^2 e^{-F(y)/\varepsilon } dx = \int _{B_{2\rho }(x)} |\nabla \phi |^2 e^{-F(y)/\varepsilon } dx \le C \rho ^{n-2}. \end{aligned}$$

In the case where \(F(x;D) > F(x)\), consider the set \({\hat{D}} = \{z \in B_R: F(z) \le F(x;D)\}\) and let \({\hat{D}}_1\) be the component that intersects D. We set \({\widetilde{D}} = ({\hat{D}}_1 \cup D) {\setminus } B_{4\rho }(x)\). By the Lipschitz continuity, we know that \(\inf _{{\widetilde{D}}} F > - C\rho \). We take \(\chi _{{\widetilde{D}} + B_{\rho }} \le 1- \phi \le \chi _{{\widetilde{D}} + B_{2\rho }}\), where \(|\nabla \phi | \le C/\rho \), and get

$$\begin{aligned} \int _{B_R} |\nabla \phi |^2 e^{-F(y)/\varepsilon } dx&= \int _{({\widetilde{D}} + B_{2\rho }) \setminus {\widetilde{D}}} |\nabla \phi |^2 e^{-F(y)/\varepsilon } dx \\&\le C \rho ^{-2}|({\widetilde{D}} + B_{2\rho }) \setminus {\widetilde{D}}|. \end{aligned}$$

Again, the upper bound follows by rescaling the potential F, as in the case of the lower bound. This completes the proof. \(\square \)
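The two-sided bound of Lemma 3.2 predicts in particular that \(-\varepsilon \log {\text {cap}}(B_\rho (x),D) \rightarrow F(x;D)\) as \(\varepsilon \rightarrow 0\), up to polynomial prefactors. In one dimension the capacity is explicit, which allows a quick numerical illustration; the setup below (the function name, the specific potential, and the normalization of the barrier height to 1) is a toy example of ours, not the n-dimensional object of the lemma.

```python
import numpy as np

def cap_1d(F, eps, a=0.0, b=1.0, n=200_001):
    """1D analogue of the capacity between neighborhoods of the endpoints
    of (a, b): cap = eps / int_a^b e^{F/eps} ds (explicit minimizer)."""
    s = np.linspace(a, b, n)
    return eps / (np.sum(np.exp(F(s) / eps)) * (s[1] - s[0]))

# Toy potential with a single barrier of height 1 at the midpoint.
F = lambda t: 1.0 - (2.0 * t - 1.0) ** 2

# -eps * log(cap) should approach the barrier height 1 from above as
# eps -> 0, the prefactors contributing the O(eps log eps) correction.
rates = [-e * np.log(cap_1d(F, e)) for e in (0.2, 0.1, 0.05)]
```

With this potential the rates stay slightly above 1 and decrease with \(\varepsilon \), consistent with the exponential factor \(e^{-F(x;D)/\varepsilon }\) sandwiched between polynomial prefactors.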

Lemma 3.3

Let \(A,B \subset B_{R}\) be smooth disjoint sets, and let \(x \in B_R\) be such that \(B_\varepsilon (x) \subset B_R \setminus (A \cup B)\), \(\varepsilon \in (0,1)\). Then, if \(F(x;B) \le F(x; A)\), there exist constants q and C such that

$$\begin{aligned} h_{A,B}(x) \le C \varepsilon ^q e^{-(F(x;A)-F(x;B))/\varepsilon }. \end{aligned}$$

Proof

Let \(L:=\Vert \nabla F\Vert _{L^\infty (B_{R})}\). Combining Lemmas 3.1 and 3.2 with \(R = \varepsilon \) and \(r = \min \{\varepsilon /L,\varepsilon \}\) yields the result. \(\square \)

Remark 3.4

By relabeling \(A,B\) to \(B,A\) and using the fact that \(h_{A,B} = 1-h_{B,A}\), we get that if the reverse inequality holds, i.e. \(F(x;B) > F(x; A)\), then

$$\begin{aligned} 1-h_{A,B}(x) \le C \varepsilon ^q e^{-(F(x;B)-F(x;A))/\varepsilon }. \end{aligned}$$

Lemma 3.5

Let \(\Omega \) be a smooth domain and let \(x_a,x_b \in \Omega \subset B_R\) be two local minimum points of F. Fix \(0< \delta < \delta _1\) and assume that \(U_{-\delta /3} = \{ x: F(x) < F(x_a; x_b) - \delta /3\} \subset \Omega \). Then there exist an \(\varepsilon _0 \in (0,1)\) and a constant \(C > 1\) such that, for any \(0 \le \varepsilon \le \varepsilon _0\) for which \(B_{3 \varepsilon }(x_a),B_{3 \varepsilon }(x_b ) \subset U_{-\delta /3}\), the following holds: If \(U_i\) is a component of \(U_{-\delta /3}\), then

$$\begin{aligned} \underset{U_i}{{\text {osc}}\,}\ h_{B_\varepsilon (x_a),B_\varepsilon (x_b)} \le C\varepsilon . \end{aligned}$$

Proof

Consider any component \(U_i\) of \(U_{-\delta /3}\). We note that, for \(\varepsilon \) small enough depending on the Lipschitz constant of F in \(B_R\) and on \(\delta \), there exists a Lipschitz domain \(D_i\) satisfying

$$\begin{aligned} U_i + B_\varepsilon \subset D_i \subset U_{-\delta /4}. \end{aligned}$$

For simplicity, denote \(u:=h_{B_\varepsilon (x_a),B_\varepsilon (x_b)}\). Since \(D_i\) is Lipschitz we may use the Poincaré inequality to get

$$\begin{aligned} \int _{D_{i}} |u - u_{D_i}|^2 dx \le C\int _{D_{i}} |\nabla u|^2 dx. \end{aligned}$$

Using that \(D_i \subset U_{-\delta /4}\) together with Lemma 2.4

$$\begin{aligned} \int _{D_{i}} |\nabla u|^2 dx&\le e^{\sup _{D_{i}} F/\varepsilon } \int _{D_{i}} |\nabla u|^2 e^{-F(x)/\varepsilon }dx \\&\le \varepsilon ^{-1} e^{\sup _{D_{i}} F/\varepsilon } {\text {cap}}(B_\varepsilon (x_a),B_\varepsilon (x_b);\Omega ). \end{aligned}$$

Using the definition of \(U_{-\delta /4}\) and Lemma 3.2, we get

$$\begin{aligned} \int _{D_{i}} |u - u_{D_i}|^2 dx \le C \varepsilon ^{q_1} e^{-\delta /4\varepsilon } \end{aligned}$$
(3.7)

for some constant \(q_1 \in {\mathbb {R}}\). Now, for any \(x_0 \in U_i\) we have by Lemma 2.7 that

$$\begin{aligned} \sup _{B_\varepsilon (x_0)} |u-u_{D_i}|^2 \le C \fint _{B_{2\varepsilon }(x_0)} |u-u_{D_i}|^2 \, dx, \end{aligned}$$

which together with (3.7) gives

$$\begin{aligned} \sup _{B_\varepsilon (x_0)} |u-u_{D_i}|^2 \le C\varepsilon ^{q_1-n} e^{-\delta /4\varepsilon }. \end{aligned}$$

Since \(x_0\) was an arbitrary point in \(U_i\) we conclude that there exists \(\varepsilon _0 \in (0,1)\) depending only on the data such that if \(\varepsilon < \varepsilon _0\), the claim holds. \(\square \)

We conclude this subsection with an estimate relating the value of the potential of the equilibrium potential to the ratio of the \(L^1\) norm of the equilibrium potential and the capacity.

Lemma 3.6

Let \(\Omega \) be a smooth domain and let \(A \Subset \Omega \) be a smooth open set and consider \(w_{A,\Omega }\) as the potential of the equilibrium potential in \(\Omega \) (see Definition 2.5). Let \(x \in \Omega \) be a critical point of F such that \(B_{3\sqrt{\varepsilon }}(x) \subset \Omega \). Then there exists a constant \(C > 1\) such that for \(\rho < \sqrt{\varepsilon }\) we have

$$\begin{aligned} w_{A,\Omega }(x) - C \bigg ( \frac{\rho }{\sqrt{\varepsilon }}\bigg )^{\alpha } (w_{A,\Omega }(x)+1)&\le \frac{{\displaystyle \int } h_{A,\Omega ^c} h_{B_\rho (x),A} d\mu _\varepsilon }{{\text {cap}}(B_\rho (x),A;\Omega )} \\&\le w_{A,\Omega }(x) + C \bigg ( \frac{\rho }{\sqrt{\varepsilon }}\bigg )^{\alpha } (w_{A,\Omega }(x)+1). \end{aligned}$$

Proof

From Lemma 2.6 we get

$$\begin{aligned} \int _{\partial B_\rho (x)} w_{A,\Omega }(y) e^{-F(y)/\varepsilon } d\mu _{B_\rho (x),A}(y) = \int _{\Omega \setminus A} h_{A,\Omega ^c}(z) h_{B_\rho (x),A}(z) d\mu _\varepsilon . \end{aligned}$$

We can estimate the left hand side as

$$\begin{aligned} w_{A,\Omega }(x) - {\text {osc}}_{B_\rho } w_{A,\Omega }&\le \inf _{B_\rho (x)} w_{A,\Omega } \le \frac{\int _{\partial B_\rho (x)} e^{-F(y)/\varepsilon } w_{A,\Omega } d\mu _{B_\rho (x),A}(y)}{\int _{\partial B_\rho (x)} e^{-F(y)/\varepsilon } d\mu _{B_\rho (x),A}(y)} \\&\le \sup _{B_\rho (x)} w_{A,\Omega } \le w_{A,\Omega }(x) + {\text {osc}}_{B_\rho } w_{A,\Omega }. \end{aligned}$$

We want to estimate the oscillation of \(w_{A,\Omega }\) which we do by considering

$$\begin{aligned} {\text {osc}}w_{A,\Omega } = {\text {osc}}(w_{A,\Omega }+h_{A,\Omega ^c}-h_{A,\Omega ^c}) \le {\text {osc}}(w_{A,\Omega }+h_{A,\Omega ^c}) + {\text {osc}}(h_{A,\Omega ^c}). \end{aligned}$$

Now, the oscillations of \(w_{A,\Omega }+h_{A,\Omega ^c}\) and \(h_{A,\Omega ^c}\) can be estimated by Lemma 2.11 for \(\rho \le \frac{1}{C} \sqrt{\varepsilon }\). That is,

$$\begin{aligned} {\text {osc}}_{B_\rho } (w_{A,\Omega }+h_{A,\Omega ^c}) + {\text {osc}}_{B_\rho }(h_{A,\Omega ^c})&\le C \left( \frac{\rho }{\sqrt{\varepsilon } }\right) ^\alpha \sup _{B_{\sqrt{\varepsilon } }}(w_{A,\Omega }+h_{A,\Omega ^c}) \\&\quad + C \left( \frac{\rho }{\sqrt{\varepsilon } }\right) ^\alpha \sup _{B_{\sqrt{\varepsilon } }}(h_{A,\Omega ^c}). \end{aligned}$$

We apply Lemma 2.7 to replace the suprema on the right hand side with the value at x, as both \(w_{A,\Omega }+h_{A,\Omega ^c}\) and \(h_{A,\Omega ^c}\) satisfy the Harnack inequality (see Lemma 2.10). That is,

$$\begin{aligned} {\text {osc}}_{B_\rho } (w_{A,\Omega }+h_{A,\Omega ^c}) + {\text {osc}}_{B_\rho }(h_{A,\Omega ^c})&\le C \left( \frac{\rho }{\sqrt{\varepsilon } }\right) ^\alpha (w_{A,\Omega }(x)+h_{A,\Omega ^c}(x)) \\&\le C \left( \frac{\rho }{\sqrt{\varepsilon } }\right) ^\alpha (w_{A,\Omega }(x)+1). \end{aligned}$$

It is easily seen that the above can be extended to \(\rho \le \sqrt{\varepsilon }\) by applying Lemma 2.10 again and by enlarging the constant C. The proof is completed by using (2.6) and collecting the estimates above. \(\square \)

3.2 Laplace asymptotics for log-concave functions

The assumptions (1.4) and (1.6) ensure that near critical points the potential F is well approximated by convex functions. Therefore we will need basic estimates for log-concave functions which, rather surprisingly, we did not find in the literature.

Lemma 1.18

Assume \(G: {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) is a convex function which has a proper minimum at the origin and \(G(0)=0\). Then there exists a constant \(C = C(n) > 1\) such that

$$\begin{aligned} \frac{1}{C} |\{G< \varepsilon \}| \le \int _{{\mathbb {R}}^n} e^{-\frac{G}{\varepsilon }} \, dx \le C |\{G < \varepsilon \}|. \end{aligned}$$
(3.8)

Moreover, there is a constant \(C = C(n)\) such that for all \(\Lambda > 1\), we have

$$\begin{aligned} \int _{\{G < \Lambda \varepsilon \}} e^{-\frac{G}{\varepsilon }} \, dx \ge (1- \eta (C \Lambda ^{-1})) \int _{{\mathbb {R}}^n} e^{-\frac{G}{\varepsilon }}\,dx, \end{aligned}$$
(3.9)

with \(\eta \) as in (1.15).
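Before turning to the proof, the bound (3.8) can be illustrated numerically in one dimension: for \(G(t) = t^2\) the exact ratio \(\int e^{-G/\varepsilon }\,dx / |\{G < \varepsilon \}|\) equals \(\sqrt{\pi }/2\), and for \(G(t) = |t|\) it tends to 1, in both cases independently of \(\varepsilon \). The grid-based sketch below is our own toy check, not part of the proof.

```python
import numpy as np

def ratio(G, eps, x):
    """(int e^{-G/eps} dx) / |{G < eps}| computed on a uniform 1D grid.
    By (3.8) this ratio stays between dimensional constants as eps -> 0."""
    vals = G(x)
    dx = x[1] - x[0]
    integral = np.sum(np.exp(-vals / eps)) * dx
    sublevel = np.count_nonzero(vals < eps) * dx
    return integral / sublevel

x = np.linspace(-5.0, 5.0, 2_000_001)
quad = [ratio(lambda t: t ** 2, e, x) for e in (0.1, 0.01, 0.001)]
lin = [ratio(lambda t: np.abs(t), e, x) for e in (0.1, 0.01, 0.001)]
# quad should stay near sqrt(pi)/2 ~ 0.886 and lin near 1 for every eps.
```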

Proof

By approximation we may assume that G is smooth. The lower bound in (3.8) follows immediately from

$$\begin{aligned} \int _{{\mathbb {R}}^n} e^{-\frac{G}{\varepsilon }} \, dx \ge \int _{\{G< \varepsilon \} } e^{-\frac{G}{\varepsilon }} \, dx \ge e^{-1} |\{G < \varepsilon \}|. \end{aligned}$$

To prove the upper bound in (3.8) we first show that, for all \(t>0\), it holds

$$\begin{aligned} |\{G< 2t \}| \le 2^n |\{G < t \}|. \end{aligned}$$
(3.10)

In order to prove (3.10) it is enough to consider only the case \(t = 1\) (the general case follows by considering \({\widetilde{G}} = G/t\)). Denote \(E_1 = \{ G < 1 \}\) and \(E_2 = \{ G < 2 \}\). Hence our goal is to show

$$\begin{aligned} E_2 \subset 2 E_1 = \{ 2 x: x \in E_1\}. \end{aligned}$$

Fix \({\hat{x}} \in \partial E_1\) and define \(g(t) = G(t {\hat{x}}) \) for \(t \ge 0\). By our assumptions, g is a smooth convex function satisfying \(g(0)=0\) and \(g(1)=1\). By convexity \(g'\) is increasing, and since the mean value theorem gives \(g'(\xi ) = g(1) - g(0) = 1\) for some \(\xi \in (0,1)\), we conclude that \(g'(1) \ge 1\). Now, by the fundamental theorem of calculus,

$$\begin{aligned} g(2) - g(1) = \int _1^2 g'(t) \, dt \ge 1 \end{aligned}$$

which gives \(g(2) \ge 2\). This means that for all \({\hat{x}} \in \partial E_1\) we have \(G(2{\hat{x}}) \ge 2\). That is, we have \(E_2 \subset 2 E_1\). Thus

$$\begin{aligned} |E_2| \le |2 E_1| \le 2^n |E_1| \end{aligned}$$

and (3.10) follows. Iterating (3.10) gives

$$\begin{aligned} |\{G< 2^j \varepsilon \}| \le 2^{jn} |\{G < \varepsilon \}| \end{aligned}$$

and hence

$$\begin{aligned} |\{G< \varrho \varepsilon \}| \le (2\varrho )^n |\{G < \varepsilon \}| \end{aligned}$$
(3.11)

for all \(\varrho \ge 1\). We conclude the proof of the upper bound in (3.8) by using (3.11) as

$$\begin{aligned} \int _{{\mathbb {R}}^n} e^{-\frac{G}{\varepsilon }} \, dx&\le \sum _{j=0 }^\infty \int _{\{ j\varepsilon \le G< (j+1) \varepsilon \}} e^{-\frac{G}{\varepsilon }} \, dx \\&\le \sum _{j=0 }^\infty |\{ j \varepsilon \le G< (j+1)\varepsilon \} | e^{-j} \\&\le 2^n \sum _{j=0 }^\infty e^{-j} (j+1)^n |\{G< \varepsilon \}| \le C(n) |\{G < \varepsilon \}|. \end{aligned}$$

It remains to prove (3.9). Fix \(\Lambda >1\). Then, for every \(x \in \{ G \ge \Lambda \varepsilon \}\), it holds

$$\begin{aligned} e^{-\frac{G(x)}{\varepsilon }} = e^{-\frac{\Lambda G(x)}{\Lambda \varepsilon }} = \left( e^{-\frac{G(x)}{\Lambda \varepsilon }} \right) ^\Lambda = \left( e^{-\frac{G(x)}{\Lambda \varepsilon }} \right) ^{\Lambda -1} e^{-\frac{G(x)}{\Lambda \varepsilon }} \le e^{-\Lambda +1} e^{-\frac{G(x)}{\Lambda \varepsilon }}. \end{aligned}$$
(3.12)

Therefore we have, by (3.8), (3.11) and (3.12),

$$\begin{aligned} \begin{aligned} \int _{\{G \ge \Lambda \varepsilon \}} e^{-\frac{G}{\varepsilon }} \, dx&\le e^{-\Lambda +1} \int _{\{G \ge \Lambda \varepsilon \}} e^{-\frac{G}{ \Lambda \varepsilon }} \, dx \le e^{-\Lambda +1} \int _{{\mathbb {R}}^n } e^{-\frac{G}{ \Lambda \varepsilon }} \, dx\\&\le Ce^{-\Lambda } |\{ G< \Lambda \varepsilon \} | \\&\le C e^{-\Lambda } \Lambda ^n |\{ G < \varepsilon \} | \\&\le C e^{-\Lambda } \Lambda ^n \int _{{\mathbb {R}}^n } e^{-\frac{G}{\varepsilon }} \, dx \end{aligned} \end{aligned}$$
(3.13)

and the inequality (3.9) follows by using (1.15). \(\square \)
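Although the proof is elementary, the uniform comparability in (3.8) and the doubling property (3.10) are easy to check numerically. The following sketch (our illustration, not part of the paper) uses the convex potential \(G(x,y) = x^2 + |y|\), for which the ratio of the integral to the sublevel-set volume stays in a fixed window as \(\varepsilon \) decreases, while sublevel volumes at most quadruple when t doubles (here \(n = 2\)):

```python
import numpy as np

def G(x, y):
    # an arbitrary convex test potential with G(0) = 0 and a proper minimum at 0
    return x**2 + np.abs(y)

# uniform grid on a box outside of which exp(-G/eps) is negligible
h = 0.004
xs = np.arange(-3.0, 3.0, h)
X, Y = np.meshgrid(xs, xs)
vals = G(X, Y)
cell = h * h

# (3.8): integral / volume stays in a fixed window [1/C, C] as eps shrinks
ratios = []
for eps in (0.5, 0.1, 0.02):
    integral = np.sum(np.exp(-vals / eps)) * cell
    volume = np.sum(vals < eps) * cell
    ratios.append(integral / volume)
    print(f"eps={eps:5.2f}  integral/volume = {ratios[-1]:.3f}")

# (3.10): doubling of sublevel sets, |{G < 2t}| <= 2^n |{G < t}| with n = 2
for t in (0.1, 0.4):
    assert np.sum(vals < 2 * t) <= 4 * np.sum(vals < t)
```

For this particular G the exact ratio is \(3\sqrt{\pi }/4 \approx 1.33\) for every \(\varepsilon \), consistent with the \(\varepsilon \)-independent constant in (3.8).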

Lemma 3.8

Assume \(G: {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) is a function which has a proper global minimum at the origin and \(G(0)=0\). Furthermore, assume there is a constant \(C_0\) such that, for all \(a > 0\) and \(\varepsilon > 0\), it holds that

$$\begin{aligned} \int _{G > a} e^{-G/\varepsilon } dx < C_0 e^{-a/\varepsilon }. \end{aligned}$$
(3.14)

If there is a level \(\varepsilon _0 > 0\) such that G is convex on the component of \(\{G(x) < \varepsilon _0\}\) that contains 0, then there is an \(\varepsilon _1(n,|\{G< \varepsilon _0/2\}|) < \varepsilon _0\) and a constant \(C=C(C_0,n) > 1\) such that, for all \(\varepsilon < \varepsilon _1\), it holds that

$$\begin{aligned} C^{-1} |\{G< \varepsilon \}| \le \int _{{\mathbb {R}}^n} e^{-\frac{G}{\varepsilon }} \, dx \le C |\{G < \varepsilon \}|. \end{aligned}$$

Proof

Since G is convex on the component of \(\{G < \varepsilon _0\}\) containing the origin, the component of the level set \(\{G \le \varepsilon _0/2\}\) containing the origin is convex, and we may extend G outside that set to a globally convex function. This allows us to apply Lemma 3.7 and obtain

$$\begin{aligned} \frac{1}{C} |\{G< \varepsilon \}| \le \int _{\{G< \varepsilon _0/2\}} e^{-\frac{G}{\varepsilon }} \, dx \le C |\{G < \varepsilon \}|. \end{aligned}$$
(3.15)

Now, split the integral as

$$\begin{aligned} \int e^{-G/\varepsilon } dx = \int _{G \le \varepsilon _0/2} e^{-G/\varepsilon } dx + \int _{G > \varepsilon _0/2} e^{-G/\varepsilon } dx. \end{aligned}$$

From (3.15) it follows that it suffices to bound the second integral on the right hand side. Using (3.14) for \(a = \varepsilon _0/2\) we get

$$\begin{aligned} \int _{G > \varepsilon _0/2} e^{-G/\varepsilon } dx \le C_0 e^{-(\varepsilon _0/2)/\varepsilon }. \end{aligned}$$

Since G is convex in the level set \(\{G < \varepsilon _0/2\}\), which is itself a convex set, we can construct a conical comparison function \({\widetilde{G}}\) as follows: for any \({\hat{x}} \in \partial \{G < \varepsilon _0/2\}\) and \(t \ge 0\), set \({\widetilde{G}}(t {\hat{x}}) = t \varepsilon _0/2\). By convexity, \(G(t {\hat{x}}) \le t G({\hat{x}}) = {\widetilde{G}}(t {\hat{x}})\) for \(t \in [0,1]\), so \(G \le {\widetilde{G}}\) in \(\{G < \varepsilon _0/2\}\). Consequently the level sets of \({\widetilde{G}}\) satisfy, for \(\varepsilon < \varepsilon _0/2\),

$$\begin{aligned} |\{{\widetilde{G}}< \varepsilon \}| \le |\{G < \varepsilon \}|. \end{aligned}$$

However,

$$\begin{aligned} |\{{\widetilde{G}}< \varepsilon \}| = \left( \frac{\varepsilon }{\varepsilon _0/2}\right) ^n |\{{\widetilde{G}}< \varepsilon _0/2\}| = \left( \frac{\varepsilon }{\varepsilon _0/2}\right) ^n |\{G < \varepsilon _0/2\}|. \end{aligned}$$

Now, we can choose \(\varepsilon _1(n,|\{G< \varepsilon _0/2\}|) < \varepsilon _0/2\) such that for \(\varepsilon < \varepsilon _1\) we have

$$\begin{aligned} e^{-(\varepsilon _0/2)/\varepsilon } \le \left( \frac{\varepsilon }{\varepsilon _0}\right) ^n |\{G < \varepsilon _0/2\}|. \end{aligned}$$

This means that for \(\varepsilon < \varepsilon _1\) we also have

$$\begin{aligned} \int _{G > \varepsilon _0/2} e^{-G/\varepsilon } dx \le C_0 |\{G < \varepsilon \}| \end{aligned}$$

which together with (3.15) completes the proof. \(\square \)

We conclude this section with the following technical lemma which is useful when we study the potential near critical points.

Lemma 3.9

Assume \(G: {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) is a convex function which has a proper minimum at the origin and \(G(0)=0\). Let \(\omega : [0,\infty ) \rightarrow [0,\infty )\) be as in (1.4) and (1.6). Then for all \(\delta \le \delta _0\), we have

$$\begin{aligned} \int _{\{ G < \delta \} } e^{-\frac{G(x)}{\varepsilon }} e^{ \frac{\pm \omega (G(x))}{\varepsilon }} \, dx \simeq \int _{{\mathbb {R}}^n} e^{-\frac{G(x)}{\varepsilon }} \, dx. \end{aligned}$$

Proof

Denote \(\Lambda _\varepsilon = \frac{\varepsilon _1}{\varepsilon }\) with \(\varepsilon _1\) as in (1.13). From (1.14) we know that \(\Lambda _\varepsilon \rightarrow \infty \) as \(\varepsilon \rightarrow 0\). Now, by (3.9) in Lemma 3.7 and (1.14), we get

$$\begin{aligned} \int _{\{G< \varepsilon _1\}} e^{-\frac{G(x)}{\varepsilon }} e^{\frac{\omega (G(x))}{\varepsilon }} \, dx \simeq \int _{\{G < \Lambda _\varepsilon \varepsilon \}} e^{-\frac{G(x)}{\varepsilon }} \, dx \simeq \int _{{\mathbb {R}}^n} e^{-\frac{G(x)}{\varepsilon }} \, dx. \end{aligned}$$
(3.16)

The lower bound follows immediately from this. In order to prove the upper bound, note that \(\omega (s) \le s/2\) for all \(s \le \delta _0\) by assumption. Therefore we can repeat the argument in (3.13) to get

$$\begin{aligned} \int _{\{ \varepsilon _1< G < \delta \} } e^{-\frac{G(x)}{\varepsilon }} e^{\frac{\omega (G(x))}{\varepsilon }} \, dx \le \int _{\{ G > \Lambda _\varepsilon \varepsilon \} } e^{-\frac{G(x)}{2\varepsilon }} \, dx \le \eta (C \Lambda _\varepsilon ^{-1}) \int _{{\mathbb {R}}^n } e^{-\frac{G}{\varepsilon }} \, dx, \end{aligned}$$

which together with (3.16) yields the upper bound. \(\square \)

4 Proofs of Theorems 1 and 2

In this section we prove the capacity estimate in Theorem 1 and the exit time estimate in Theorem 2. Before we begin, we would like to remind the reader that, as in Sect. 3, we will assume that F satisfies our structural assumptions and that all constants depend on the data; see the paragraph after (1.16).

We first study the geometric quantities \(d_\varepsilon (A,B;\Omega )\) and \(V_\varepsilon (A,B;\Omega )\) defined in (1.8) and (1.9) and give a more explicit, but less geometric, characterization. The characterization for the geodesic distance \(d_\varepsilon (A,B;\Omega )\) turns out to be much easier than for the separating surface \(V_\varepsilon (A,B;\Omega )\) and therefore we prove it first.

In the following two propositions we first fix two local minimum points of F, say \(x_a\) and \(x_b\). Their communication height \(F(x_a;x_b)\) defines the set \(U_{-\delta /3} = \{ F < F(x_a;x_b) - \delta /3\}\), whose components we recall are called islands. We first study the geodesic distance between \(B_{\varepsilon }(x_1)\) and \(B_{\varepsilon }(x_2)\), where \(x_1\) and \(x_2\) are two local minima of F.

Proposition 4.1

Let us fix local minimum points \(x_a\) and \(x_b\) of F. Let \(U_{x_1}\) and \(U_{x_2}\) be the islands, i.e., the components of the set \(U_{-\delta /3} = \{ F < F(x_a;x_b) - \delta /3\}\), containing \(B_\varepsilon (x_1)\) and \(B_\varepsilon (x_2)\) respectively, where \(x_1\) and \(x_2\) are two (possibly different) local minimum points. Assume that z is a saddle point in \(Z_{x_a,x_b}\) such that the bridge \(O_{z, \delta }\) connects \(U_{x_1}\) and \(U_{x_2}\), and denote \(\Omega = U_{x_1} \cup U_{x_2} \cup O_{z, \delta }\). Then it holds for \(g_z\) given in (1.6) that

$$\begin{aligned} d_{\varepsilon }(B_\varepsilon (x_1),B_\varepsilon (x_2); \Omega ) \simeq e^{\frac{F(z)}{\varepsilon }} \int _{{\mathbb {R}}} e^{-\frac{g_z(y_1)}{\varepsilon }} \, dy_1. \end{aligned}$$

Proof

Denote \(g = g_z\). We begin by proving the lower bound, i.e.,

$$\begin{aligned} d_{\varepsilon }(B_\varepsilon (x_1),B_\varepsilon (x_2); \Omega ) \ge (1- {\hat{\eta }}(C, \varepsilon )) e^{\frac{F(z)}{\varepsilon }} \int _{{\mathbb {R}}} e^{-\frac{g(y_1)}{\varepsilon }} \, dy_1. \end{aligned}$$
(4.1)

To this aim we choose a smooth curve \(\gamma \in {\mathcal {C}}(B_\varepsilon (x_1),B_\varepsilon (x_2); \Omega )\) which, by assumptions, intersects the bridge \(O_{z, \delta }\). We may choose the coordinates in \({\mathbb {R}}^n\) such that \(z= 0\) and

$$\begin{aligned} O_{\delta } = O_{z, \delta } = \{ y_1:g(y_1)< \delta \} \times \{ y' \in {\mathbb {R}}^{n-1}: G(y') < \delta \}. \end{aligned}$$

Let \(\tau \in {\mathbb {R}}\) be such that \(g(\tau ) < \frac{\delta }{100}\) and denote \(S_\tau = \{ \tau \} \times \{ G < \delta \} \subset O_\delta \). Let us next show that \(S_\tau \in {\mathcal {S}}(B_\varepsilon (x_1),B_\varepsilon (x_2); \Omega )\). Since \(0 \in Z_{x_a,x_b}\), it follows that \(|F(0) - F(x_a;x_b)| < \delta /3\). Now, consider a narrow cylinder of the form \({\hat{O}}_\delta =\{g(y_1)< \frac{\delta }{100}\} \times \{G(y') < \delta \} \subset O_\delta \). Then any surface \(S_\tau \) with \(g(\tau ) < \delta /100\) lies inside \({\hat{O}}_\delta \) and is parallel to the bottom and top of the cylinder. Thus, by the assumption (1.6) and \(|F(0) - F(x_a;x_b)| < \delta /3\), we know that the cylindrical part of the boundary (\(\{g < \delta /100\} \times \{ G = \delta \}\)) does not intersect \(U_{x_1} \cup U_{x_2}\). Therefore any curve in \({\mathcal {C}}(B_\varepsilon (x_1),B_\varepsilon (x_2);\Omega )\) will pass through \(S_\tau \). In conclusion, \(S_\tau \in {\mathcal {S}}(B_\varepsilon (x_1),B_\varepsilon (x_2); \Omega )\) for \(g(\tau ) < \frac{\delta }{100}\).

Let us fix \(\gamma \in {\mathcal {C}}(B_\varepsilon (x_1),B_\varepsilon (x_2); \Omega )\) and denote the projection onto the \(y_1\)-axis by \(\pi _1: {\mathbb {R}}^n \rightarrow {\mathbb {R}}\), i.e., \(\pi _1(y) = y_1\). Since \(S_\tau \in {\mathcal {S}}(B_\varepsilon (x_1),B_\varepsilon (x_2); \Omega )\), the curve \(\gamma \) intersects \(S_\tau \). Thus we conclude that \(\tau \in \pi _1\big (\gamma ([0,1]) \cap O_\delta \big )\). This holds for every \(\tau \in \{g < \delta /100\}\) and therefore

$$\begin{aligned} \{g < \delta /100\} \subset \pi _1\big (\gamma ([0,1]) \cap O_\delta \big ). \end{aligned}$$
(4.2)

Now the assumption (1.6) implies that, in the set \(O_{ \delta }\), it holds that

$$\begin{aligned} \begin{aligned} F(y) -F(0)&\ge -g(y_1) - \omega (g(y_1)) + G(y') -\omega (G(y')) \\&\ge -g(y_1) - \omega (g(y_1)) +\frac{1}{2} G(y') \ge -g(y_1) - \omega (g(y_1)). \end{aligned} \end{aligned}$$
(4.3)

Then for \(\gamma _1 = \pi _1(\gamma )\) we have by (4.2), (4.3) and Lemma 3.9 that

$$\begin{aligned} \int _{\{t:\gamma (t) \in O_{ \delta }\}} |\gamma '| e^{\frac{F(\gamma )}{\varepsilon }} \, dt&\ge e^{\frac{F(0)}{\varepsilon }}\int _{\{t:\gamma (t) \in O_{ \delta }\}} |\gamma _1'| e^{\frac{-g(\gamma _1)}{\varepsilon }}e^{\frac{-\omega (g(\gamma _1))}{\varepsilon }} \, dt \\&\ge e^{\frac{F(0)}{\varepsilon }}\int _{\{ g < \delta /100\}} e^{\frac{-g(y_1)}{\varepsilon }}e^{-\frac{\omega (g(y_1))}{\varepsilon }} \, dy_1\\&\ge (1- \eta (C \varepsilon ))e^{\frac{F(0)}{\varepsilon }} \int _{{\mathbb {R}}} e^{-\frac{g(y_1)}{\varepsilon }} \, dy_1, \end{aligned}$$

proving (4.1).

To prove the upper bound, i.e.

$$\begin{aligned} d_{\varepsilon }(B_\varepsilon (x_1),B_\varepsilon (x_2); \Omega ) \le (1+ {\hat{\eta }}(C,\varepsilon )) e^{\frac{F(z)}{\varepsilon }} \int _{{\mathbb {R}}} e^{-\frac{g(y_1)}{\varepsilon }} \, dy_1 \end{aligned}$$

with \({\hat{\eta }}\) as in (1.16), we denote by \(\tau _-< 0 < \tau _+\) the numbers such that \(g(\tau _-) = g(\tau _+) = \delta \). We first connect the points \(x_3 = (\tau _-, 0)\) and \(x_4 = (\tau _+, 0)\) by a segment \(\gamma _0(t) = tx_4 +(1-t)x_3\). Then it holds by the assumption (1.6) and by Lemma 3.9 that

$$\begin{aligned} \int _{\gamma _0} e^{\frac{F(\gamma _0)}{\varepsilon }} \, dt \le e^{\frac{F(0)}{\varepsilon }} \int _{\{g < \delta \}} e^{\frac{-g(y_1)}{\varepsilon }}e^{\frac{\omega (g(y_1))}{\varepsilon }} \, dy_1 \le (1+\eta (C\Lambda _\varepsilon ^{-1})) e^{\frac{F(0)}{\varepsilon }} \int _{{\mathbb {R}}} e^{-\frac{g(y_1)}{\varepsilon }} \, dy_1. \end{aligned}$$

We then connect the points \(x_3, x_4\) to \(x_1, x_2\) with smooth curves \(\gamma _1, \gamma _2 \subset \{x \in \Omega : F< F(0) -\delta /3\}\). Since it holds \(g(t) \le C|t|\) we have \(|\{g <\varepsilon \}|\ge c \, \varepsilon \). Therefore it holds by Lemma 3.7 that

$$\begin{aligned} \int _{\gamma _i} |\gamma _i'|e^{\frac{F(\gamma _i)}{\varepsilon }} \, dt&\le e^{\frac{F(0)}{\varepsilon }} e^{\frac{-\delta }{3\varepsilon }} \int _{\gamma _i} |\gamma _i'|\, dt \le C e^{\frac{F(0)}{\varepsilon }} e^{\frac{-\delta }{3\varepsilon }} \\&\le e^{\frac{F(0)}{\varepsilon }} {\hat{\eta }}(C, \varepsilon ) |\{g <\varepsilon \}| \le e^{\frac{F(0)}{\varepsilon }} {\hat{\eta }}(C,\varepsilon ) \int _{{\mathbb {R}}} e^{-\frac{g(y_1)}{\varepsilon }} \, dy_1. \end{aligned}$$

The constant in the last expression depends on the length of \(\gamma _i\). This length can be bounded by an argument similar to the proof of Lemma 3.2, this time using coverings with balls of size comparable to \(\delta \). Since we are in the level set \(\{F < F(0) -\delta /3\}\), we have some room to replace our curve by another curve whose length depends only on \(\delta \) and R, while retaining the same upper bound as above.

The upper bound now follows by joining the paths \(\gamma _1, \gamma _0\) and \(\gamma _2\), thus, constructing a competitor for the geodesic length. \(\square \)
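In the model case of a quadratic saddle, the statement of Proposition 4.1 can be checked directly. In the sketch below (our illustration, not from the paper), we take \(F(y_1,y_2) = -y_1^2 + y_2^2\) with \(z = 0\) and \(g(y_1) = y_1^2\): the weighted length of the straight path through the saddle reproduces \(e^{F(z)/\varepsilon } \int _{{\mathbb {R}}} e^{-g/\varepsilon }\, dy_1 = \sqrt{\pi \varepsilon }\), and any parallel path is strictly longer:

```python
import numpy as np

# model saddle F(y1, y2) = -y1^2 + y2^2, so F(z) = 0 at z = 0 and g(y1) = y1^2
eps = 0.01
y1 = np.linspace(-1.0, 1.0, 200001)
dy = y1[1] - y1[0]

def weighted_length(offset):
    # path t -> (t, offset), so |gamma'| = 1 and F(gamma) = -t^2 + offset^2
    return np.sum(np.exp((-y1**2 + offset**2) / eps)) * dy

on_axis = weighted_length(0.0)
print(on_axis, np.sqrt(np.pi * eps))   # nearly equal
# a parallel path at offset c is longer by the factor e^{c^2/eps}
assert weighted_length(0.1) > on_axis
```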

We need to prove a similar result to Proposition 4.1, but for the minimal cut. This turns out to be trickier than the corresponding result for paths.

Proposition 4.2

Assume that \(x_a,x_b,z, U_{x_1},U_{x_2},O_{z, \delta }\) and \(\Omega \) are as in Proposition 4.1. Then it holds for \(G_z\) from (1.6) that

$$\begin{aligned} V_{\varepsilon }(B_\varepsilon (x_1),B_\varepsilon (x_2);\Omega ) \simeq e^{-\frac{F(z)}{\varepsilon }} \int _{{\mathbb {R}}^{n-1}} e^{-\frac{G_z(y')}{\varepsilon }} \, dy'. \end{aligned}$$

Proof

Denote \(G = G_z\) for short. As in the proof of Proposition 4.1 we may assume that \(z = 0\) and that

$$\begin{aligned} O_{\delta } = O_{z, \delta } = \{ y_1:g(y_1)< \delta \} \times \{ y' \in {\mathbb {R}}^{n-1}: G(y') < \delta \}. \end{aligned}$$

Let us begin by proving the upper bound. In the proof of Proposition 4.1 we already observed that the surface \(S_0 = \{ 0\} \times \{ G < \delta \} \) is in the family of separating surfaces \(S_0 \in {\mathcal {S}}(B_\varepsilon (x_1),B_\varepsilon (x_2); \Omega )\). Therefore the assumption (1.6) and Lemma 3.9 together with the definition of \(V_\varepsilon \) imply

$$\begin{aligned} \begin{aligned} V_{\varepsilon }(B_\varepsilon (x_1),B_\varepsilon (x_2);\Omega )&\le \int _{S_0 } e^{-\frac{F}{\varepsilon }} \, d{\mathcal {H}}^{n-1} \le e^{-\frac{F(0)}{\varepsilon }}\int _{\{ G < \delta \} } e^{-\frac{G}{\varepsilon }} e^{\frac{\omega (G)}{\varepsilon }} \, dy' \\&\simeq e^{-\frac{F(0)}{\varepsilon }} \int _{{\mathbb {R}}^{n-1}} e^{-\frac{G(y')}{\varepsilon }} \, dy' . \end{aligned} \end{aligned}$$

The upper bound follows directly from this. Moreover by Lemma 3.7 it holds that

$$\begin{aligned} V_{\varepsilon }(B_\varepsilon (x_1),B_\varepsilon (x_2);\Omega ) \le C e^{-\frac{F(0)}{\varepsilon }}|\{ G < \varepsilon \}| \le C e^{-\frac{F(0)}{\varepsilon }}. \end{aligned}$$
(4.4)

In order to prove the lower bound we fix a small \(t>0\) and choose a smooth hypersurface \(S \in {\mathcal {S}}(B_\varepsilon (x_1),B_\varepsilon (x_2);\Omega )\) such that

$$\begin{aligned} \int _{S} e^{-\frac{F}{\varepsilon }} \, d{\mathcal {H}}^{n-1} \le V_{\varepsilon }(B_\varepsilon (x_1),B_\varepsilon (x_2);\Omega ) + t. \end{aligned}$$

Then S divides the domain \(\Omega \) into two components, of which we denote the one containing \(x_1\) by \({\hat{U}}_{x_1}\). Note that then \(\partial {\hat{U}}_{x_1} \cap \Omega \subset S\). Denote \(\rho = \varepsilon ^2\). We use an idea from [16] and instead of studying the set \({\hat{U}}_{x_1}\), we study the density

$$\begin{aligned} v_\rho (x):= \frac{|B_\rho (x) \cap {\hat{U}}_{x_1}|}{|B_\rho |} \end{aligned}$$

which can be written as a convolution, \(v_\rho (x) = \frac{1}{|B_\rho |} ( \chi _{ {\hat{U}}_{x_1}} * \chi _{B_\rho } )(x)\). To see why studying \(v_\rho \) is relevant, we need some setup that we will present next.

We choose a subset \({\hat{O}}\) of the bridge \(O_{\delta }\) as

$$\begin{aligned} {\hat{O}}:= \{ y_1:g(y_1)< \delta \} \times \{ y' \in {\mathbb {R}}^{n-1}: G(y') < \delta /100\}, \end{aligned}$$

see Fig. 3, and denote its bottom/top boundaries by \(\Gamma _{-} \) and \(\Gamma _{+}\), i.e.,

$$\begin{aligned} \{g= \delta \} \times \{ G < \delta /100\} = \Gamma _{-} \cup \Gamma _{+}. \end{aligned}$$

Now since \(0 \in Z_{x_a,x_b}\), we have \(|F(0) - F(x_a;x_b)| < \delta /3\). Using this and (1.6) we deduce that \(\Gamma _{-} \cup \Gamma _{+} \subset \{F < F(x_a;x_b) -\delta /3\}\) and \(\Gamma _{-} \cup \Gamma _{+} \subset \{F < F(0) -\delta /3\}\), i.e.,

$$\begin{aligned} \Gamma _{-} \cup \Gamma _{+} \subset (U_{x_1} \cup U_{x_2}) \cap \{F < F(0) -\delta /3\}. \end{aligned}$$
(4.5)

Moreover, by relabeling we may assume that \(\Gamma _{+} \subset U_{x_1} \) and \(\Gamma _{-} \subset U_{x_2}\). Furthermore, by the Lipschitz continuity of F we have \(|F(x) -F(y)| \le c \varepsilon ^2\) for all \( y \in B_\rho (x)\). Note also that \(B_\rho (x) \subset \Omega \) for all \(x \in {\hat{O}}\).

We will now relate \(v_\rho \) to the surface integral of S as follows: Recall that the set \({\hat{U}}_{x_1}\) has smooth boundary in \(\Omega \) and thus its characteristic function is a BV-function. In particular, the derivative \(|\nabla \chi _{{\hat{U}}_{x_1}}|\) is a Radon measure in \(\Omega \) and

$$\begin{aligned} \int _{\Omega } |\nabla \chi _{{\hat{U}}_{x_1}}| e^{-\frac{F}{\varepsilon }}\, dx = \int _{\partial {\hat{U}}_{x_1} \cap \Omega } e^{-\frac{F}{\varepsilon }}\, d{\mathcal {H}}^{n-1} \le \int _{S} e^{-\frac{F}{\varepsilon }}\, d{\mathcal {H}}^{n-1}. \end{aligned}$$
(4.6)

Using the definition of \(v_\rho \) and the Lipschitzness of F inside \(B_\rho (x)\), we may thus estimate

$$\begin{aligned} \begin{aligned} \int _{{\hat{O}}} |\nabla v_\rho | e^{-\frac{F(x)}{\varepsilon }} \, dx&\le \frac{1}{|B_\rho |} \int _{{\hat{O}}} e^{-\frac{F(x)}{\varepsilon }} \int _{{\mathbb {R}}^n} |\nabla \chi _{ {\hat{U}}_{x_1}}(y) | |\chi _{B _\rho }(x-y)| \,dy dx \\&\le \frac{1}{|B_\rho |} \int _{{\hat{O}}} e^{c \varepsilon }\int _{{\mathbb {R}}^n} e^{-\frac{F(y)}{\varepsilon }} |\nabla \chi _{ {\hat{U}}_{x_1}}(y) | |\chi _{B _\rho }(x-y)| \,dy dx \\&\le (1+C\varepsilon ) \int _{\Omega } |\nabla \chi _{{{\hat{U}}_{x_1}}}(y)| e^{-\frac{F(y)}{\varepsilon }}\, dy. \end{aligned} \end{aligned}$$
(4.7)

Putting together (4.6) and (4.7) we see that it is enough to establish a lower bound on the integral of \(|\nabla v_\rho |\) in \({\hat{O}}\). In order to achieve this, we first claim that for all x such that \(B_\rho (x) \subset \Omega \cap \{F < F(0) -\delta /3\}\) we have, when \(\varepsilon \) is small,

$$\begin{aligned} \begin{aligned} v_\rho (x)&\ge 1- C \varepsilon \qquad \text { for }\, x \in U_{x_1} \cap \{F< F(0) -\delta /3\} \quad \text {and } \\ v_\rho (x)&\le C \varepsilon \qquad \text { for }\, x \in U_{x_2} \cap \{F < F(0) -\delta /3\}. \end{aligned} \end{aligned}$$
(4.8)

We now complete the proof of the lower bound using (4.8), postponing the proof of (4.8) to the end. By the fundamental theorem of calculus and (4.5), we get that for all \(y' \in \{ G < \delta /100\}\)

$$\begin{aligned} 1- 2C \varepsilon \le \int _{\{g < \delta \}} \partial _{y_1} v_\rho (y_1,y')\, dy_1. \end{aligned}$$
(4.9)

Now, arguing as in (4.3) we conclude that

$$\begin{aligned} F(y) -F(0) \le G(y') + \omega (G(y') )\qquad \text {for } \, y \in O_{\delta }. \end{aligned}$$
(4.10)

Multiplying and dividing by \(e^{-\frac{F(y)}{\varepsilon }}\) inside the integral in (4.9) and using (4.10) we get

$$\begin{aligned} (1- 2C \varepsilon ) e^{-\frac{F(0)}{\varepsilon }} e^{\frac{- G(y')-\omega (G(y'))}{\varepsilon }} \le \int _{\{g < \delta \}} |\partial _{y_1} v_\rho (y_1,y')| e^{-\frac{F(y)}{\varepsilon }} \, dy_1. \end{aligned}$$

Integrating over \(y' \in \{G < \delta /100\}\) we obtain

$$\begin{aligned} \begin{aligned}&(1- 2C \varepsilon ) e^{-\frac{F(0)}{\varepsilon }} \int _{\{ G< \delta /100\}} e^{-\frac{G(y')}{\varepsilon }}e^{-\frac{\omega (G(y'))}{\varepsilon }} \, dy' \\&\quad \le \int _{\{ G< \delta /100\}} \int _{\{g < \delta \}} |\partial _{y_1} v_\rho (y_1,y')| e^{-\frac{F(y)}{\varepsilon }} \, dy_1 dy' \\&\quad \le \int _{{\hat{O}}} |\nabla v_\rho | e^{-\frac{F}{\varepsilon }} \, dx. \end{aligned} \end{aligned}$$

Estimating the integral on the left hand side from below by Lemma 3.9, we obtain

$$\begin{aligned} (1- \eta (\varepsilon )) e^{-\frac{F(0)}{\varepsilon }} \int _{{\mathbb {R}}^{n-1}} e^{-\frac{G(y')}{\varepsilon }} \, dy' \le \int _{{\hat{O}}} |\nabla v_\rho | e^{-\frac{F}{\varepsilon }} \, dx. \end{aligned}$$
(4.11)

Now, assuming (4.8), we may use (4.6), (4.7) and (4.11) to get the lower bound from

$$\begin{aligned} \begin{aligned} (1- \eta (\varepsilon )) e^{-\frac{F(0)}{\varepsilon }} \int _{{\mathbb {R}}^{n-1}} e^{-\frac{G(y')}{\varepsilon }} \, dy'&\le (1+C\varepsilon ) \int _{S} e^{-\frac{F}{\varepsilon }}\, d{\mathcal {H}}^{n-1} \\&\le (1+C\varepsilon ) \big (V_{\varepsilon }(B_\varepsilon (x_1),B_\varepsilon (x_2);\Omega ) + t\big ) \end{aligned} \end{aligned}$$

Since \(t > 0\) is arbitrary, this yields the lower bound. Hence, in order to complete the proof it remains to prove (4.8).

To this aim, we fix \(x \in (U_{x_1} \cup U_{x_2}) \cap \{ F < F(0) - \delta /3\}\) such that \(B_\rho (x) \subset \Omega \). By the relative isoperimetric inequality (see for instance [2, Theorem 3.40] or [18]) and by \(\rho = \varepsilon ^2\) it holds

$$\begin{aligned} \begin{aligned} {\mathcal {H}}^{n-1}\big (\partial {\hat{U}}_{x_1} \cap B_\rho (x)\big )&\ge c \min \big \{ |B_\rho (x) \cap {\hat{U}}_{x_1}|^{\frac{n-1}{n} } , |B_\rho (x) \setminus {\hat{U}}_{x_1}|^{\frac{n-1}{n} } \big \}\\&\ge c\, \varepsilon ^{2(n-1)} \min \big \{ v_\rho (x) , 1- v_\rho (x) \big \}^{\frac{n-1}{n}}. \end{aligned} \end{aligned}$$

On the other hand, since \(x \in \{ F < F(0)-\delta /3\}\) and thus \( B_\rho (x) \subset \{ F < F(0) -\delta /4 \}\), we have by (4.4) that

$$\begin{aligned} \begin{aligned} {\mathcal {H}}^{n-1}\big (\partial {\hat{U}}_{x_1}\cap B_\rho (x)\big )&\le e^{\frac{F(0)}{ \varepsilon }} e^{-\frac{\delta }{4 \varepsilon }} \int _{\partial {\hat{U}}_{x_1} \cap B_\rho (x)} e^{-\frac{F}{\varepsilon }} \, d{\mathcal {H}}^{n-1} \\&\le e^{\frac{F(0)}{ \varepsilon }} e^{-\frac{\delta }{4 \varepsilon }} \int _{S} e^{-\frac{F}{\varepsilon }} \, d{\mathcal {H}}^{n-1} \\&\le e^{\frac{F(0)}{ \varepsilon }} e^{-\frac{\delta }{4 \varepsilon }} \big (V_{\varepsilon }(B_\varepsilon (x_1),B_\varepsilon (x_2);\Omega ) + t\big ) \\&\le C e^{-\frac{\delta }{4 \varepsilon }}. \end{aligned} \end{aligned}$$

By combining the two inequalities above we obtain (4.8), which completes the proof. \(\square \)
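As with Proposition 4.1, the model quadratic saddle gives a quick consistency check. In the sketch below (our illustration with \(F(y_1,y_2) = -y_1^2 + y_2^2\), \(z = 0\) and \(G(y_2) = y_2^2\), not from the paper), the flat cut \(S_0 = \{y_1 = 0\}\) attains the value \(e^{-F(z)/\varepsilon } \int e^{-G/\varepsilon }\, dy' = \sqrt{\pi \varepsilon }\) predicted by Proposition 4.2, while tilted cuts through the saddle are strictly more expensive:

```python
import numpy as np

# model saddle F(y1, y2) = -y1^2 + y2^2, so F(z) = 0 at z = 0 and G(y2) = y2^2
eps = 0.01
y2 = np.linspace(-1.0, 1.0, 200001)
dy = y2[1] - y2[0]

def cut_weight(a):
    # line {y1 = a*y2} through the saddle, with length element sqrt(1 + a^2)
    F = -(a * y2)**2 + y2**2
    return np.sqrt(1 + a**2) * np.sum(np.exp(-F / eps)) * dy

flat = cut_weight(0.0)
print(flat, np.sqrt(np.pi * eps))  # nearly equal
assert cut_weight(0.5) > flat      # tilting only increases the weight
```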

Fig. 3

The bridge \(O_{\delta }\) connects the sets \(U_{x_1}\) and \(U_{x_2}\). The smaller cylindrical bridge \({\hat{O}}\) has its top and bottom inside \(U_{x_1} \cup U_{x_2}\)

4.1 Proof of Theorem 1

We consider the parallel case and the series case separately.

4.1.1 Parallel case

Assume that the saddle points in \(Z_{x_a,x_b} = \{ z_1, \dots , z_N\}\) are parallel, see Fig. 2. Let us fix a saddle point \(z_i \in Z_{x_a,x_b}\) and recall the definition of the bridge \(O_{z_i,\delta }\) in (1.7). By considering \(F-F(z_i)\) instead of F, we may assume

$$\begin{aligned} z_i = 0, \quad F(0) = 0 \end{aligned}$$

and

$$\begin{aligned} O_{\delta }:= O_{0,\delta } = \{ x_1 \in {\mathbb {R}}: g(x_1)< \delta \} \times \{ x' \in {\mathbb {R}}^{n-1}: G(x') < \delta \}. \end{aligned}$$

We also recall the notation \(U_{-\delta /3} = \{ F < F(x_a;x_b) -\delta /3\}\). We denote the island, i.e., the component of \(U_{-\delta /3}\), which contains the point \(x_a\) by \({U_{x_a}}\) and the island which contains the point \(x_b\) by \({U_{x_b}}\). Since the saddle points are parallel, the bridge \(O_{\delta }\) connects the islands \({U_{x_a}}\) and \({U_{x_b}}\) which means that the set \(\Omega = {U_{x_a}} \cup {U_{x_b}} \cup O_{\delta } \) is open and connected. By Lemma 3.5 we have \(\text {osc}_{{U_{x_a}}}(h_{A,B}) + \text {osc}_{{U_{x_b}}}(h_{A,B}) \le C \varepsilon \). Since \(h_{A,B} = 1\) in \(B_\varepsilon (x_a) \subset {U_{x_a}}\) and \(h_{A,B} = 0\) in \(B_\varepsilon (x_b) \subset {U_{x_b}}\), it follows that

$$\begin{aligned} h_{A,B} \ge 1- C\varepsilon \,\, \text {in } \, {U_{x_a}} \quad \text {and} \quad h_{A,B} \le C\varepsilon \,\, \text {in } \, {U_{x_b}}. \end{aligned}$$
(4.12)

Let us choose a subset \({\hat{O}}\) of the bridge \(O_{\delta }\) as in the proof of Proposition 4.2 (see Fig. 3), i.e.

$$\begin{aligned} {\hat{O}}:= \{ x_1:g(x_1)< \delta \} \times \{ x' \in {\mathbb {R}}^{n-1}: G(x') < \delta /100\}, \end{aligned}$$

and denote its bottom/top boundaries by \(\Gamma _{-} \subset \{ x_1 <0\}\) and \(\Gamma _{+}\subset \{ x_1 >0\} \), i.e.,

$$\begin{aligned} \{g = \delta \} \times \{ G < \delta /100\} = \Gamma _{-} \cup \Gamma _{+}. \end{aligned}$$

Then by using (1.6) and arguing as in the proof of Proposition 4.2 we deduce that \(\Gamma _{-}, \Gamma _{+} \subset \{ F < F(x_a;x_b) -\delta /3\}\) and we may assume \(\Gamma _{+} \subset {U_{x_a}}\) and \(\Gamma _{-} \subset {U_{x_b}}\). Therefore, by (4.12), we have that \(h_{A,B} \le C\varepsilon \) on \(\Gamma _- \) and \(h_{A,B} \ge 1- C\varepsilon \) on \(\Gamma _+\). Now, by the fundamental theorem of calculus and Cauchy-Schwarz inequality, it holds that

$$\begin{aligned} 1- 2C\varepsilon&\le \int _{\{ g< \delta \}} \partial _{x_1} h_{A,B}(x) \,dx_1 = \int _{\{ g< \delta \}}\partial _{x_1} h_{A,B}(x) e^{-\frac{F(x)}{2\varepsilon }} e^{\frac{F(x)}{2\varepsilon }} \,dx_1 \nonumber \\&\le \left( \int _{\{ g< \delta \}} |\nabla h_{A,B}(x)|^2 e^{-\frac{F(x)}{\varepsilon }} \,dx_1 \right) ^{\frac{1}{2}} \left( \int _{\{ g < \delta \}} e^{\frac{F(x)}{\varepsilon }} \,dx_1\right) ^{\frac{1}{2}}. \end{aligned}$$
(4.13)

Let us next estimate the last term above. By assumption (1.6) we have, for \(x \in {\hat{O}}\),

$$\begin{aligned} F(x) \le -g(x_1) + \omega (g(x_1) ) + G(x') + \omega (G(x')). \end{aligned}$$

Therefore, by Lemma 3.9 and Proposition 4.1, we can estimate

$$\begin{aligned} \begin{aligned} \int _{\{ g< \delta \}} e^{\frac{F(x)}{\varepsilon }} \,dx_1&\le (1 + {\hat{\eta }}(C,\varepsilon ))e^{\frac{G(x')}{\varepsilon }} e^{\frac{\omega (G(x'))}{\varepsilon }} \int _{\{ g < \delta \}} e^{-\frac{g(x_1)}{\varepsilon }} e^{\frac{ \omega (g(x_1))}{\varepsilon }} \,dx_1 \\&\le (1 + {\hat{\eta }}(C,\varepsilon ))e^{\frac{G(x')}{\varepsilon }} e^{\frac{\omega (G(x'))}{\varepsilon }} \int _{{\mathbb {R}}} e^{-\frac{g(x_1)}{\varepsilon }} \,dx_1 \\&\le (1 + {\hat{\eta }}(C,\varepsilon ))e^{\frac{G(x')}{\varepsilon }} e^{\frac{\omega (G(x'))}{\varepsilon }} d_\varepsilon (B_\varepsilon (x_a),B_\varepsilon (x_b); \Omega ). \end{aligned} \end{aligned}$$
(4.14)

Combining the inequalities (4.13) and (4.14) leads to (with another constant C)

$$\begin{aligned} \int _{\{ g < \delta \}} |\nabla h_{A,B}(x)|^2 e^{-\frac{F(x)}{\varepsilon }} \,dx_1 \ge \frac{(1 - {\hat{\eta }}(C,\varepsilon )) }{ d_\varepsilon (B_\varepsilon (x_a),B_\varepsilon (x_b); \Omega )} e^{-\frac{G(x')}{\varepsilon }} e^{-\frac{\omega (G(x'))}{\varepsilon }} \end{aligned}$$

for all \(x' \in \{ G < \delta /100\}\). By integrating over \(x' \in \{ G < \delta /100\}\) we have by Fubini’s theorem, Lemma 3.9 and Proposition 4.2, that

$$\begin{aligned} \begin{aligned} \int _{{\hat{O}} } |\nabla h_{A,B}|^2 e^{-\frac{F(x)}{\varepsilon }} \,dx&\ge \frac{(1 - {\hat{\eta }}(C,\varepsilon )) }{ d_\varepsilon (B_\varepsilon (x_a),B_\varepsilon (x_b); \Omega )} \int _{\{ G < \delta /100\}} e^{-\frac{G(x')}{\varepsilon }} e^{-\frac{\omega (G(x'))}{\varepsilon }} \, dx'\\&\ge \frac{(1 - {\hat{\eta }}(C,\varepsilon )) }{ d_\varepsilon (B_\varepsilon (x_a),B_\varepsilon (x_b); \Omega )} \int _{{\mathbb {R}}^{n-1}} e^{-\frac{G(x')}{\varepsilon }} \, dx' \\&\ge (1 - {\hat{\eta }}(C,\varepsilon )) \frac{V_\varepsilon (B_\varepsilon (x_a),B_\varepsilon (x_b); \Omega ) }{ d_\varepsilon (B_\varepsilon (x_a),B_\varepsilon (x_b); \Omega )}. \end{aligned} \end{aligned}$$
(4.15)

Therefore, by repeating the argument for every saddle \(z_i \in Z_{x_a, x_b}\) and using the fact that the bridges \(O_{z_i, \delta }\) are disjoint, we obtain after scaling back the potential

$$\begin{aligned} \begin{aligned} \int _{{\mathbb {R}}^n} |\nabla h_{A,B}|^2 e^{-\frac{F(x)}{\varepsilon }} \,dx\ge (1 - {\hat{\eta }}(C,\varepsilon ))\sum _{i=1}^N \frac{V_\varepsilon (B_\varepsilon (x_a),B_\varepsilon (x_b); \Omega ) }{d_\varepsilon (B_\varepsilon (x_a),B_\varepsilon (x_b); \Omega )} e^{\frac{F(z_i)}{\varepsilon }}. \end{aligned} \end{aligned}$$

This yields the lower bound when the saddle points are parallel.

For the upper bound, we only give a sketch of the argument as it is fairly straightforward. The idea is to construct a competitor h in the variational characterization of the capacity, see (2.5). Let us first define h in the set \(U_{\delta /3} = \{ F < F(x_a;x_b) + \delta /3\}\). Since the saddle points \( Z_{x_a, x_b} = \{ z_1, \dots , z_N\}\) are parallel, it follows that the points \(x_a\) and \(x_b\) lie in different components of the set

$$\begin{aligned} {\widetilde{U}} = U_{\delta /3} \setminus \bigcup _{i=1}^N O_{z_i,\delta }, \end{aligned}$$

where \(O_{z_i,\delta }\) is defined in (1.7). Denote the components of \({\widetilde{U}}\) containing \(x_a\) and \(x_b\) by \({{\widetilde{U}}_{x_a}}\) and \({{\widetilde{U}}_{x_b}}\), respectively. We first define

$$\begin{aligned} h = 1 \, \text { in } \, {{\widetilde{U}}_{x_a}} \quad \text {and} \quad h = 0 \, \text { in } \, {{\widetilde{U}}_{x_b}}. \end{aligned}$$

Let us next fix a saddle point \(z_i \in Z_{x_a, x_b}\). As before, we may again assume that

$$\begin{aligned} z_i = 0, \quad F(0) = 0 \end{aligned}$$

and

$$\begin{aligned} O_{\delta }:= O_{0,\delta } = \{ x_1 \in {\mathbb {R}}: g(x_1)< \delta \} \times \{ x' \in {\mathbb {R}}^{n-1}: G(x') < \delta \}. \end{aligned}$$

Moreover, we may assume that

$$\begin{aligned} {{\widetilde{U}}_{x_a}} \cap \partial O_{\delta } \subset \{ x_1 > 0\} \quad \text { and} \quad {{\widetilde{U}}_{x_b}} \cap \partial O_{\delta } \subset \{ x_1 < 0\}. \end{aligned}$$

Let \(\tau _-<0 < \tau _+ \) be numbers such that \(g(\tau _-) = g(\tau _+) = \delta /100\). We define \(h(x) = \varphi (x_1)\) in \(O_{\delta }\) such that the function \(\varphi :[\tau _-,\tau _+]\rightarrow {\mathbb {R}}\) is a solution of the ordinary differential equation

$$\begin{aligned} \frac{d}{ds} \left( \varphi '(s) e^{\frac{g(s)}{\varepsilon }} \right) = 0 \quad \text {in } \, (\tau _-,\tau _+) \end{aligned}$$

with boundary values \(\varphi (\tau _-) = 0\) and \(\varphi (\tau _+) = 1\). Note that then

$$\begin{aligned} \varphi '(s) e^{\frac{g(s)}{\varepsilon }} = \left( \int _{\tau _-}^{\tau _+} e^{-\frac{g(x_1)}{\varepsilon }} \, dx_1\right) ^{-1}. \end{aligned}$$
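The one-dimensional profile above can be checked numerically in a model case. The sketch below uses the illustrative choices \(g(s) = s^2\), \(\varepsilon = 0.1\), \(\tau _\pm = \pm 1\) (none of which come from the paper) and integrates \(\varphi '(s) = c\, e^{-g(s)/\varepsilon }\) with c the normalizing constant above, confirming the boundary values \(\varphi (\tau _-) = 0\) and \(\varphi (\tau _+) = 1\).

```python
import numpy as np

# Illustrative model data (not from the paper): g(s) = s^2, eps = 0.1, tau_pm = +-1.
eps = 0.1
s = np.linspace(-1.0, 1.0, 20001)
ds = s[1] - s[0]
g = s ** 2

# phi'(s) = c * exp(-g(s)/eps), with c = ( int exp(-g/eps) ds )^{-1} as in the text.
w = np.exp(-g / eps)
c = 1.0 / (np.sum((w[1:] + w[:-1]) / 2) * ds)        # trapezoidal normalization
phi = np.concatenate(([0.0], np.cumsum(c * (w[1:] + w[:-1]) / 2 * ds)))

print(round(phi[0], 6), round(phi[-1], 6))           # boundary values: 0.0 and 1.0
```

By construction \(\varphi '(s) e^{g(s)/\varepsilon }\) is constant, which is exactly the Euler equation the competitor is built to satisfy.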

We extend \(\varphi \) to \({\mathbb {R}}\) by setting \(\varphi (s) = 0\) for \(s \le \tau _-\) and \(\varphi (s) = 1\) for \(s \ge \tau _+\). By construction, Lemma 3.9, and an argument similar to the one leading to (4.13), the function h then satisfies

$$\begin{aligned} \begin{aligned}&\int _{O_{\delta } } |\nabla h|^2 e^{-\frac{F(x)}{\varepsilon }} \, dx \\&\quad \le \int _{\{ g< \delta \} } | \varphi '(x_1)|^2 e^{\frac{g(x_1)}{\varepsilon }}e^{\frac{\omega (g(x_1))}{\varepsilon }} \, dx_1 \int _{\{ G< \delta \} } e^{-\frac{G(x')}{\varepsilon }} e^{\frac{\omega (G(x'))}{\varepsilon }}\, dx' \\&\quad \le (1+ {\hat{\eta }}(C,\varepsilon )) \left( \int _{{\mathbb {R}}} e^{-\frac{g(x_1)}{\varepsilon }} \, dx_1\right) ^{-1} \left( \int _{{\mathbb {R}}^{n-1}} e^{-\frac{G(x')}{\varepsilon }} \, dx'\right) . \end{aligned} \end{aligned}$$

By repeating the construction for every saddle point \(z_i \in Z_{x_a, x_b}\), we obtain a function defined in \(U_{\delta /3}\), which we denote by \(h: U_{\delta /3} \rightarrow {\mathbb {R}}\). Note that the estimate (4.15) is now optimal for h. Moreover, h is Lipschitz continuous. We extend h to \({\mathbb {R}}^n\) without increasing the Lipschitz constant L, e.g., by defining

$$\begin{aligned} h(x) = \sup _{y \in U_{\delta /3}} \big ( h(y) -L|x-y|\big ) \quad \text {for } \, x \in {\mathbb {R}}^n \setminus U_{\delta /3}. \end{aligned}$$
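This sup-formula is the standard McShane-type extension. A one-dimensional numerical sketch (with an illustrative L-Lipschitz sample function, not taken from the paper) confirms that the extension agrees with h on the original set and does not increase the Lipschitz constant:

```python
import numpy as np

# Illustrative 1-D sketch of the extension h(x) = sup_y ( h(y) - L|x - y| ).
U = np.linspace(0.0, 1.0, 201)          # discretization of the set where h is defined
hU = np.abs(np.sin(3.0 * U))            # an L-Lipschitz sample function on U
L = 3.0                                 # a Lipschitz constant for hU

def h_ext(x):
    # Pointwise supremum of L-Lipschitz cones anchored at the data points.
    return np.max(hU - L * np.abs(x - U))

xs = np.linspace(-1.0, 2.0, 601)
vals = np.array([h_ext(x) for x in xs])
slopes = np.abs(np.diff(vals)) / np.diff(xs)
print(bool(slopes.max() <= L + 1e-9))   # True: the Lipschitz constant is preserved
```

The key point is that each cone \(y \mapsto h(y) - L|x-y|\) is exactly L-Lipschitz in x, and a supremum of L-Lipschitz functions is again L-Lipschitz.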

This leads to the upper bound and completes the proof in the parallel case; we leave the remaining details to the reader.

4.1.2 Series case

Assume that the saddle points \(Z_{x_a, x_b} = \{ z_1, \dots , z_N\}\) are in series, see Fig. 2. We use the ordering as in (1.11) and denote the points \(x_i\) as in (1.12). We also fix the islands, \(U_{x_{i-1}}\) and \(U_{x_i}\) (components of \(\{ F < F(x_a; x_b) -\delta /3\}\)), which are connected by the bridge \(O_{z_i, \delta }\). Again we may assume that \(z_i = 0\), \(F(0) = 0\) and that

$$\begin{aligned} O_{ \delta } =O_{z_i, \delta } = \{y_1: g(y_1)< \delta \} \times \{y': G(y') < \delta \}. \end{aligned}$$

By Lemma 3.5 we have \(\text {osc}_{U_{x_{i-1}}}(h_{A,B}) + \text {osc}_{U_{x_{i}}}(h_{A,B}) \le C \varepsilon \). Therefore there are numbers \(c_{i-1},c_i\) such that

$$\begin{aligned} | h_{A,B} - c_{i-1} | \le C\varepsilon \,\, \text {in } \,U_{x_{i-1}} \quad \text {and} \quad | h_{A,B} - c_i | \le C\varepsilon \,\, \text {in } \, U_{x_{i}}. \end{aligned}$$

Then, using the fundamental theorem of calculus as in (4.13), we obtain

$$\begin{aligned} \begin{aligned} |c_{i-1}-c_i|- 2C\varepsilon \le \left( \int _{\{g<\delta \} } |\nabla h_{A,B}(y)|^2 e^{-\frac{F(y)}{\varepsilon }} \,dy_1 \right) ^{\frac{1}{2}} \left( \int _{\{g<\delta \} } e^{\frac{F(y)}{\varepsilon }} \,dy_1\right) ^{\frac{1}{2}} \end{aligned} \end{aligned}$$

for

$$\begin{aligned} (y_1,y') \in \{ g<\delta \} \times \{ G <\delta /100\}. \end{aligned}$$

Moreover, arguing as in (4.14), we have

$$\begin{aligned} \int _{\{g<\delta \} } e^{\frac{F(y)}{\varepsilon }} \,dy_1 \le (1 + {\hat{\eta }}(C,\varepsilon ))e^{\frac{G(y')}{\varepsilon }}e^{\frac{\omega (G(y'))}{\varepsilon }} d_\varepsilon (x_{i-1},x_i). \end{aligned}$$

These together imply

$$\begin{aligned} \int _{\{g<\delta \}} |\nabla h_{A,B}(y)|^2 e^{-\frac{F(y)}{\varepsilon }} \,dy_1 \ge \frac{(1 - {\hat{\eta }}(C,\varepsilon )) (c_{i-1}-c_i)^2}{ d_\varepsilon (x_{i-1},x_i)} e^{-\frac{G(y')}{\varepsilon }}e^{-\frac{\omega (G(y'))}{\varepsilon }}. \end{aligned}$$

By integrating over \(y' \in \{ G < \delta /100\}\) we have, by Fubini’s theorem, Lemma 3.9, and Proposition 4.2, that

$$\begin{aligned} \int _{O_{\delta }} |\nabla h_{A,B}|^2 e^{-\frac{F(y)}{\varepsilon }} \,dy \ge (1 - {\hat{\eta }}(C, \varepsilon )) (c_{i-1}-c_i)^2 \frac{V_\varepsilon (x_{i-1},x_i) }{ d_\varepsilon (x_{i-1},x_i)}. \end{aligned}$$

By repeating the argument for every saddle \(z_i \in Z_{x_a, x_b}\) and using the fact that the sets \(O_{z_i, \delta }\) are disjoint we obtain

$$\begin{aligned} \int _{{\mathbb {R}}^n} |\nabla h_{A,B}|^2 e^{-\frac{F(y)}{\varepsilon }} \,dy\ge (1 - {\hat{\eta }}(C,\varepsilon )) \sum _{i=1}^N (c_{i-1}-c_i)^2 \frac{V_\varepsilon (x_{i-1},x_i) }{ d_\varepsilon (x_{i-1},x_i)}. \end{aligned}$$

Recall that the numbers \(c_i\) are the approximate values of \(h_{A,B}\) in the components \(U_{x_{i}}\). Therefore we may choose them such that \(1 = c_0\) and \(c_{N} =0\). By denoting \(y_i = c_{i-1}-c_i\) and \(a_i = \frac{V_\varepsilon (x_{i-1},x_i) }{ d_\varepsilon (x_{i-1},x_i)}\) we may write

$$\begin{aligned} \sum _{i=1}^N (c_{i-1}-c_i)^2 \frac{V_\varepsilon (x_{i-1},x_i) }{ d_\varepsilon (x_{i-1},x_i)} = \sum _{i=1}^N a_i y_i^2, \end{aligned}$$

where we have the constraint \(\sum _{i=1}^N y_i=1\). By a standard optimization argument (using Lagrange multipliers), under this constraint it holds that

$$\begin{aligned} \sum _{i=1}^N a_i y_i^2 \ge \left( \sum _{i=1}^N \frac{1}{a_i} \right) ^{-1}. \end{aligned}$$
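This is the familiar "resistors in series" inequality; besides Lagrange multipliers, it also follows from Cauchy–Schwarz applied to \(1 = \sum _i y_i = \sum _i (\sqrt{a_i}\, y_i)(1/\sqrt{a_i})\), with equality when \(y_i\) is proportional to \(1/a_i\). A quick numerical sketch (with illustrative random positive weights \(a_i\), not from the paper) confirms both the bound and the equality case:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.uniform(0.5, 5.0, size=6)        # illustrative positive weights a_i
bound = 1.0 / np.sum(1.0 / a)            # ( sum_i 1/a_i )^{-1}

# Equality case: y_i proportional to 1/a_i (the Lagrange-multiplier optimum).
y_star = (1.0 / a) / np.sum(1.0 / a)
print(abs(np.sum(a * y_star ** 2) - bound) < 1e-12)   # True

# Random competitors with sum y_i = 1 never beat the bound.
ok = True
for _ in range(1000):
    p = rng.normal(size=6)
    p -= p.mean()                        # zero-sum perturbation keeps the constraint
    y = y_star + 0.5 * p
    ok = ok and (np.sum(a * y ** 2) >= bound - 1e-12)
print(ok)                                # True
```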

This yields the lower bound in the case when the saddle points are in series. The upper bound follows from an argument similar to the one in the parallel case, and we leave the details to the reader. This completes the proof in the series case, and hence the whole proof. \(\square \)

4.2 Proof of Theorem 2

Let us first recall the notation related to Theorem 2. We assume that the local minima \(x_i\) of F are ordered so that \(F(x_i) \le F(x_j)\) whenever \(i \le j\), and they are grouped into sets \(G_i\) such that \(x_i, x_j \in G_k\) if \(F(x_i) = F(x_j)\), and \(x \in G_i\), \(y \in G_j\) with \(i <j\) if \(F(x) < F(y)\). We also denote \(F(G_i):= F(x)\) for \(x \in G_i\), \(S_k = \bigcup _{i=1}^k G_i\), \(G_k^\varepsilon = \bigcup _{x \in G_k} B_\varepsilon (x)\), and \(S_k^\varepsilon = \bigcup _{i=1}^k G_i^\varepsilon \).

The proof of Theorem 2 follows from the following lemma together with Lemma 3.6 and Theorem 1.

Lemma 4.3

Under the assumptions of Theorem 2, there exist constants \(C=C(F) > 1\) and \(\varepsilon _0=\varepsilon _0(F) \in (0,1)\) such that, for all \(0 < \varepsilon \le \varepsilon _0\), we have

$$\begin{aligned} \frac{1}{C} \sum _{x \in G_{k+1}} |O_{x,\varepsilon }| \le e^{\frac{F(G_{k+1})}{\varepsilon }} \int h_{G_{k+1}^\varepsilon ,S_k^\varepsilon } d\mu _\varepsilon \le C \sum _{x \in G_{k+1}} |O_{x,\varepsilon }|. \end{aligned}$$

We first prove Theorem 2; the proof of Lemma 4.3 is given afterwards.

Proof of Theorem 2

Using Lemma 3.6 with the choice \(\rho = \varepsilon \) we obtain, for \(\varepsilon \) small enough and \(x \in G_{k+1}^\varepsilon \), that

$$\begin{aligned} {\mathbb {E}}^x[\tau _{S_k^\varepsilon } {\mathbb {I}}_{\tau _{S_k^\varepsilon } < \tau _{\Omega ^c}}] \le C \frac{{\displaystyle \int } h_{{S_k^\varepsilon },{\Omega ^c}} h_{G_{k+1}^\varepsilon ,{S_k^\varepsilon }} d\mu _\varepsilon }{{\text {cap}}(G_{k+1}^\varepsilon ,{S_k^\varepsilon })} + C \varepsilon ^{\alpha /2}. \end{aligned}$$

The ratio above can be estimated by using Lemma 4.3 and the monotonicity of the capacity. That is, the numerator can be bounded by Lemma 4.3, while for the capacity we have

$$\begin{aligned} {\text {cap}}(G_{k+1}^\varepsilon ,{S_k^\varepsilon }) \ge {\text {cap}}(G_{k+1}^\varepsilon ,{G_k^\varepsilon }) \ge \max _{x \in G_k, y \in G_{k+1} } {\text {cap}}(B_\varepsilon (x),{B_\varepsilon (y)}). \end{aligned}$$
(4.16)

The claim for the parallel and series cases now follows by assuming that the maximum is attained for a pair of minima \(x_a \in G_k\), \(x_b \in G_{k+1}\) and applying Theorem 1. \(\square \)

Remark 4.4

We note that in the general case, the last inequality in (4.16) has the optimal dependence on \(\varepsilon \), but the two sides may differ by a constant. Essentially, the inequality is sharp only when a single saddle contributes to the total value of the capacity. Hence we have the sharp estimate when the saddle points are parallel or in series, but in general the situation might be more complicated. We illustrate this in Fig. 4, where each gray dot is a saddle at the same height and the sets A and B form \(G_k\). Then even the precise value of \({\text {cap}}(G_{k+1}^\varepsilon ,A \cup B)\) is non-trivial to calculate.

Fig. 4 Geometric view of multiple minima at the same height and multiple saddles at the same height

Proof of Lemma 4.3

First we will prove a localization estimate for exponential integrals. Consider a set O containing the origin and a function f such that \(f(0) = l\) is a proper local minimum and f is locally convex around 0. Then there exists an \(\varepsilon _0\) such that, for any \(\varepsilon < \varepsilon _0\),

$$\begin{aligned} \int _O e^{-f(x)/\varepsilon } dx \le C e^{-l/\varepsilon }|\{f < \varepsilon \} \cap O|. \end{aligned}$$
(4.17)

We will first prove (4.17) and then repeatedly apply it to prove Lemma 4.3.

In order to prove (4.17), we begin by normalizing so that \(l = 0\). Then we extend f outside O by \(+\infty \) and call the extended function \({\hat{f}}\). We first prove

$$\begin{aligned} \int _{\{{\hat{f}} > a\}} e^{-{\hat{f}}(x)/\varepsilon } dx \le c e^{-a/\varepsilon } \end{aligned}$$

which, by the definition of \({\hat{f}}\), is equivalent to

$$\begin{aligned} \int _{\{f > a\} \cap O} e^{-f(x)/\varepsilon } dx \le c e^{-a/\varepsilon }. \end{aligned}$$

This now follows from Lemma 3.8, using \(|O| < \infty \) and observing that \({\hat{f}}\) satisfies the assumptions of Lemma 3.8. Hence (4.17) follows.
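The Laplace-type bound (4.17) can be sanity-checked numerically in a model case. With the illustrative choice \(f(x) = x^2\) on \(O = (-1,1)\) and \(l = 0\) (not taken from the paper), the ratio of \(\int _O e^{-f/\varepsilon }\,dx\) to \(e^{-l/\varepsilon }|\{f<\varepsilon \} \cap O|\) stays bounded as \(\varepsilon \rightarrow 0\); here it tends to \(\sqrt{\pi }/2 \approx 0.886\).

```python
import math
import numpy as np

# Model check of (4.17): f(x) = x^2 on O = (-1, 1), l = f(0) = 0.
ratios = []
for eps in (0.2, 0.1, 0.05, 0.01):
    x = np.linspace(-1.0, 1.0, 200001)
    dx = x[1] - x[0]
    integral = np.sum(np.exp(-x ** 2 / eps)) * dx      # int_O e^{-f/eps} dx
    sublevel = 2.0 * math.sqrt(eps)                    # |{f < eps} ∩ O|
    ratios.append(integral / sublevel)
    print(round(ratios[-1], 3))                        # bounded constant, about 0.886
```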

Consider now the set

$$\begin{aligned} U_{-\delta _2/3} \equiv \{y: F(y) \le F(G_{k+1};S_k )-\delta _2/3\} \end{aligned}$$

and let \(U_i\) be the component of \(U_{-\delta _2/3}\) containing \(x_i\). We split

$$\begin{aligned} \int h_{G_{k+1}^\varepsilon ,S_k^\varepsilon } e^{-F/\varepsilon } dx = \int _{U_{-\delta _2/3}^c}h_{G_{k+1}^\varepsilon ,S_k^\varepsilon } e^{-F/\varepsilon } dx + \int _{U_{-\delta _2/3} \setminus S_k^\varepsilon }h_{G_{k+1}^\varepsilon ,S_k^\varepsilon } e^{-F/\varepsilon } dx, \end{aligned}$$

where the complement is understood with respect to the domain \(\Omega \). By assumptions (1.4) and (1.6) on F it holds that

$$\begin{aligned} F(G_{k+1};S_k ) \ge F(G_{k+1}) + \tfrac{2}{3} \delta _2. \end{aligned}$$
(4.18)

Also by the quadratic growth (1.3) we can bound the first integral as

$$\begin{aligned} \int _{U_{-\delta _2/3}^c}h_{G_{k+1}^\varepsilon ,S_k^\varepsilon } e^{-F/\varepsilon } dx \le C e^{-(F(G_{k+1};S_k )-\delta _2/3)/\varepsilon } \le C e^{-\frac{\delta _2}{3\varepsilon }} e^{-F(G_{k+1})/\varepsilon }, \end{aligned}$$

which shows that the first integral is negligible in the final estimate. For the second integral we further split

$$\begin{aligned} \int _{U_{-\delta _2/3} \setminus S_k^\varepsilon }h_{G_{k+1}^\varepsilon ,S_k^\varepsilon } e^{-F/\varepsilon } dx = \sum _{i} \int _{U_i \setminus S_k^\varepsilon }h_{G_{k+1}^\varepsilon ,S_k^\varepsilon } e^{-F/\varepsilon } dx. \end{aligned}$$

We will now consider all the different components \(U_i\) depending on which minima they contain. We start with the components \(U_i\) that do not intersect \(S_k^\varepsilon \cup G_{k+1}^\varepsilon \). Then all local minima of F in \(U_i\) have values larger than \(F(G_{k+1})\), and hence from (4.17) we get that there exists a constant C such that

$$\begin{aligned} \int _{U_i} h_{G_{k+1}^\varepsilon ,S_k^\varepsilon } e^{-F/\varepsilon } dx \le C e^{-F(G_{k+2})/\varepsilon } \le Ce^{-\delta _2/\varepsilon } e^{-F(G_{k+1})/\varepsilon }, \end{aligned}$$

where the last inequality follows from (1.19). This shows that this term is also negligible. Consider next a component \(U_i\) that intersects \(G^\varepsilon _{k+1}\) but does not intersect \(S_k^\varepsilon \). In this case, by (4.17) and Lemmas 3.5 and 3.7, we have

$$\begin{aligned} \frac{1}{C} \sum _{x \in G_{k+1} \cap U_i} |O_{x,\varepsilon }| \le e^{\frac{F(G_{k+1})}{\varepsilon }} \int _{U_i} h_{G_{k+1}^\varepsilon ,S_k^\varepsilon } e^{-F/\varepsilon } dx \le C \sum _{x \in G_{k+1} \cap U_i} |O_{x,\varepsilon }| \end{aligned}$$
(4.19)

which provides the leading term contributing to the final estimate.

Consider next a component \(U_i\) such that \(U_i \cap S_k^\varepsilon \ne \varnothing \). Since \(U_i\) is a component of \(U_{-\delta _2/3}\), it follows from \(U_i \cap S_k \ne \varnothing \) that \(F(y;S_k) \le F(y;G_{k+1})-\delta _2/3 \le F(y;G_{k+1})\) in \(U_i\). Therefore we have, by Lemma 3.3, in \(U_i\) that

$$\begin{aligned} h_{G_{k+1}^\varepsilon ,S_k^\varepsilon } \le C \varepsilon ^{q} e^{-(F(y;G_{k+1} )-F(y;S_k ))/\varepsilon }. \end{aligned}$$

Hence, with this \(q\in {\mathbb {R}}\), we obtain

$$\begin{aligned} \int _{U_i } h_{G_{k+1}^\varepsilon ,S_k^\varepsilon } e^{-F/\varepsilon } dx \le \varepsilon ^{q} \int _{U_i }e^{-(F(y;G_{k+1})-F(y;S_k))/\varepsilon } e^{-F(y)/\varepsilon } dy. \end{aligned}$$

In order to compute the integral on the right-hand side, we study the infimum of the function \(f(y) = F(y;G_{k+1} )-F(y;S_k )+F(y)\) in \(U_i\). Clearly, the infimum is attained at an interior point of \(U_i\), which we denote by \(x_i\). Then \(x_i\) is necessarily a local minimum point of F. By the above considerations, we also have \(F(y;S_k ) < F(y;G_{k+1})\) for all \(y \in U_i\), and thus \(x_i \notin G_{k+1}\). If now \(x_i \in S_k\), then \(F(x_i) = F(x_i;S_k)\) and thus, by the definition of f and by (4.18),

$$\begin{aligned} \inf _{U_i} f(y) = f(x_i) \ge F(x_i;G_{k+1}) \ge F(S_k;G_{k+1}) \ge F(G_{k+1}) + \tfrac{2}{3} \delta _2. \end{aligned}$$

It remains to study the case where \(x_i \in G_j\) for some \(j \ge k+2\). In this case we apply

$$\begin{aligned} \inf _{U_i} f(y) = f(x_i) = \overbrace{F(x_i;G_{k+1})-F(x_i;S_k)}^{\ge 0}+F(G_{j}) \ge F(G_{k+2}) \ge F(G_{k+1}) + \delta _2, \end{aligned}$$

where the last inequality follows from (1.19). Therefore we can conclude that, for \(\delta _3 = \tfrac{2}{3} \delta _2\), it holds that

$$\begin{aligned} \int _{U_i } \varepsilon ^{q} e^{-(F(y;G_{k+1})-F(y;S_k))/\varepsilon } e^{-F(y)/\varepsilon } dy \le C\varepsilon ^{q} e^{-\frac{\delta _3}{\varepsilon }} e^{-F(G_{k+1})/\varepsilon }. \end{aligned}$$

Consequently, the components \(U_i\) satisfying \(U_i \cap S_k^\varepsilon \ne \varnothing \) do not contribute either. The proof is hence completed by (4.19) and by the fact that the integrals over the remaining components are negligible whenever \(\varepsilon \) is small enough. \(\square \)