Metastability of hard-core dynamics on bipartite graphs

We study the metastable behaviour of a stochastic system of particles with hard-core interactions in a high-density regime. Particles sit on the vertices of a bipartite graph. New particles appear subject to a neighbourhood exclusion constraint, while existing particles disappear, all according to independent Poisson clocks. We consider the regime in which the appearance rates are much larger than the disappearance rates, and there is a slight imbalance between the appearance rates on the two parts of the graph. Starting from the configuration in which the weak part is covered with particles, the system takes a long time before it reaches the configuration in which the strong part is covered with particles. We obtain a sharp asymptotic estimate for the expected transition time, show that the transition time is asymptotically exponentially distributed, and identify the size and shape of the critical droplet representing the bottleneck for the crossover. For various types of bipartite graphs the computations are made explicit. Proofs rely on potential theory for reversible Markov chains, and on isoperimetric results. In a follow-up paper we will use our results to study the performance of random-access wireless networks.


Introduction and main results

Background
A metastable state in a physical system is a quasi-equilibrium that persists on a short time scale but relaxes to an equilibrium on a long time scale, called a stable state. Such behaviour often shows up when the system resides in the vicinity of a configuration where its energy has a local minimum and is subjected to a small noise: in the short run the noise is unlikely to have a significant impact on the system, whereas in the long run the noise pulls the system away from the local minimum and triggers a rapid transition towards a global minimum. When and how this transition occurs depends on the depths of the energy valley around the metastable state and the shape of the bottleneck separating the metastable state from the stable state, called the set of critical droplets.
Recently, there has been interest in metastability for interacting particle systems on graphs, which is much more challenging because of lack of periodicity. See Dommers [22], Jovanovski [36], Dommers, den Hollander, Jovanovski and Nardi [23], den Hollander and Jovanovski [32], for examples. In these papers the focus is on Ising spins subject to a Glauber spin-flip dynamics. Particularly challenging are cases where the graph is random, because the key quantities controlling the metastable crossover depend on the realisation of the graph.
In the present paper, we study the metastable behaviour of a stochastic system of particles with hard-core interactions in a high-density regime. Particles sit on the vertices of a bipartite graph. New particles appear subject to a neighbourhood exclusion constraint, while existing particles disappear, all according to independent Poisson clocks. We consider the regime in which the appearance rates are much larger than the disappearance rates, and there is a slight imbalance between the appearance rates on the two parts of the graph. Starting from the configuration in which the weak part (with the smaller appearance rate) is covered with particles (= metastable state), the system takes a long time before it reaches the configuration in which the strong part (with the larger appearance rate) is covered with particles (= stable state).
We develop an approach for the hard-core model on general bipartite graphs that reduces the description of metastability to understanding the isoperimetric properties of the graph. The Widom-Rowlinson model on a given graph fits into our setting as the hard-core model on an associated bipartite graph we call the doubled graph. Exploiting the isoperimetric properties of the graph, we are able to obtain a sharp asymptotic estimate for the expected transition time, show that the transition time is asymptotically exponentially distributed, and identify the size and shape of the critical droplet. Interesting examples include the even torus, the doubled torus, regular tree-like graphs (with high girth) and the hypercube. The isoperimetric problem we deal with is non-standard, but in some cases it can be reduced to certain standard edge/vertex isoperimetric problems. In the case of the even torus and the doubled torus, we derive complete information on the isoperimetric problem and hence obtain a complete description of metastability. In the case of the regular tree-like graphs and the hypercube our understanding of the isoperimetric problem is less complete, but we are still able to obtain some relevant information on metastability. Proofs rely on potential theory for reversible Markov chains and on isoperimetric results.
EJP 23 (2018), paper 97.
Earlier work on the same model [43] focused on the case where the appearance rates are balanced, and led to results in the high-density regime for the transition time between the two stable configurations in probability, in expected value and in distribution for finite lattices. The general framework in [43] was also exploited to derive results for the balanced hard-core model on non-bipartite graphs (e.g. the triangular lattice) [49] and for the Widom-Rowlinson model [48].
In follow-up work we will use our results to study the performance of random-access wireless networks. Here, customers arrive at the nodes of the network, but not all the nodes are able to serve their customers at all times. Each node can be either active or inactive, and two nodes connected by a bond cannot be active simultaneously. This situation arises in random-access wireless networks where, due to destructive interference, stations that are close to each other cannot use the same frequency band at the same time. The nodes switch themselves on and off at a prescribed rate that depends on how long they have been inactive, respectively, active. This switching protocol allows the nodes to share the frequency band among one another. In [10] we analyse what happens when the switching protocol is externally driven (i.e., given by prescribed switching rates), in [9] when it is internally driven (i.e., given by the queue lengths). The general problem is described in [50], where the need to develop mathematical tools to assess the efficiency of different switching protocols is argued.
The remainder of the paper is organised as follows. In Section 1.2 we define the model. In Section 1.3 we state and discuss three metastability theorems, which constitute our main results. In Section 2 we provide a general description of metastable behaviour of hard-core dynamics on bipartite graphs, distinguishing between 'simple examples' and 'sophisticated examples'. In Section 3 we make some preparations for the analysis of the 'sophisticated examples'. Section 4 gives the proof of the three metastability theorems. Section 5 is devoted to the study of certain isoperimetric problems that arise in the identification of the critical droplet. Section 6 describes in more detail what is implied by the three metastability theorems for various concrete examples.
Along the way we need various tools from potential theory that are basic yet not entirely standard. These are collected in three appendices in order to smoothen the presentation. In Appendix A we recall the main ingredients of potential theory for reversible Markov chains, including the Nash-Williams inequalities for estimating effective resistance. In Appendix B we develop a formulation of metastability for a parametrized family of reversible Markov chains in a relevant asymptotic regime. In Appendix C we provide proofs of various claims made in Section 5 and Appendices A and B, as well as an important proposition in Section 1.3 identifying the critical gate for the metastable crossover.

Model
We consider a system of particles living on a (finite, simple, undirected) connected graph $G = (V(G), E(G))$, where $V(G)$ is the set of vertices and $E(G)$ is the set of edges between them. We refer to vertices as sites. Each site of the graph can carry 0 or 1 particle, but we impose the constraint that two adjacent sites cannot carry particles simultaneously. A (valid) configuration of the model is thus an assignment $x : V(G) \to \{0, 1\}$ such that, for each pair of adjacent sites $i, j$, either $x_i = 0$ or $x_j = 0$.
Alternatively, a valid configuration can be identified with an independent set of the graph, i.e., a subset $x \subseteq V(G)$ of sites having no edges between them. We will use these two representations interchangeably, and with some abuse of notation use the same symbol to denote the map $x : V(G) \to \{0, 1\}$ or the subset $x \subseteq V(G)$. The set of valid configurations is denoted by $\mathcal{X} \subseteq \{0, 1\}^{V(G)}$.
The configuration of the system evolves according to a continuous-time Markov chain. Particles appear or disappear independently at each site, at fixed rates depending on the site and subject to the exclusion constraint. Namely, each site $k$ has two associated Poisson clocks $\xi^{\mathrm{b}}_k$ and $\xi^{\mathrm{d}}_k$, signalling the (attempted) birth and death of particles:
Birth: Clock $\xi^{\mathrm{b}}_k$ has rate $\lambda_k > 0$. Every time $\xi^{\mathrm{b}}_k$ ticks, an attempt is made to place a particle at site $k$. If one of the neighbours of site $k$ carries a particle, or if there is already a particle at $k$, then the attempt fails.
Death: Clock $\xi^{\mathrm{d}}_k$ has rate 1. Every time $\xi^{\mathrm{d}}_k$ ticks, an attempt is made to remove a particle from site $k$. If the site is already empty, then nothing is changed.
All the clocks are assumed to be independent.
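The clock description above can be sketched as a small event-driven simulation. This is only an illustration under stated assumptions: the function name `simulate_hard_core`, its arguments, and the 6-cycle example are hypothetical and not taken from the paper. The sketch merges all clocks into one Poisson stream of total rate $\sum_k (\lambda_k + 1)$ and then decides which clock ticked.

```python
import random


def simulate_hard_core(neighbours, birth_rate, t_max, seed=0):
    """Run the birth/death clocks up to time t_max (illustrative sketch).

    neighbours: dict mapping each site to the set of its adjacent sites
    birth_rate: dict mapping each site k to its birth rate lambda_k
    Death clocks have rate 1. Returns the set of occupied sites at t_max.
    """
    rng = random.Random(seed)
    sites = sorted(neighbours)
    # Each site carries a birth clock (rate lambda_k) and a death clock (rate 1).
    clock_rates = [birth_rate[k] + 1.0 for k in sites]
    total_rate = sum(clock_rates)
    occupied = set()
    t = rng.expovariate(total_rate)
    while t <= t_max:
        # Pick the site whose clock ticked, proportionally to lambda_k + 1.
        k = rng.choices(sites, weights=clock_rates)[0]
        if rng.random() < birth_rate[k] / (birth_rate[k] + 1.0):
            # Birth attempt: fails if k or one of its neighbours is occupied.
            if k not in occupied and not (neighbours[k] & occupied):
                occupied.add(k)
        else:
            # Death attempt: no effect if the site is already empty.
            occupied.discard(k)
        t += rng.expovariate(total_rate)
    return occupied


# Hypothetical example: a 6-cycle with the same large activity at every site.
cycle = {k: {(k - 1) % 6, (k + 1) % 6} for k in range(6)}
rates = {k: 50.0 for k in range(6)}
final = simulate_hard_core(cycle, rates, t_max=20.0)
print(sorted(final))
```

Whatever the random outcome, the returned set is an independent set of the graph, since every birth is checked against the exclusion constraint.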
The parameter $\lambda_k$ is called the activity or fugacity at site $k$. We are interested in the asymptotic regime where $\lambda_k \gg 1$. It is easy to verify that the distribution
\[ \pi(x) := \frac{1}{Z} \prod_{k \in V(G)} \lambda_k^{x_k} \]
(where $Z$ is the appropriate normalising constant) is the unique (reversible) equilibrium distribution for this Markov chain. Note that when $\lambda_k \gg 1$, the distribution $\pi$ is mostly concentrated at configurations that are close to maximal packing.
We prefer to develop our theory in the discrete-time setting. Therefore, we simulate the above continuous-time Markov chain by means of a single Poisson clock $\xi$ with rate $\gamma := \sum_{k \in V(G)} (\lambda_k + 1)$ and a discrete-time Markov chain (independent of the clock) in the standard fashion. In this case, the discrete-time Markov chain becomes a Gibbs sampler for the distribution $\pi$: a transition of the discrete-time chain is made by first picking a random site $I$ with distribution $i \mapsto (1 + \lambda_i)/\gamma$, and afterwards resampling the state of site $I$ according to $\pi$ conditioned on the rest of the current configuration, i.e., according to $\bigl(0 \mapsto \frac{1}{1+\lambda_I},\ 1 \mapsto \frac{\lambda_I}{1+\lambda_I}\bigr)$ if the current configuration has no particle in the neighbourhood of $I$, and $(0 \mapsto 1,\ 1 \mapsto 0)$ otherwise. More explicitly, the transition probability from a configuration $x$ to a configuration $y \neq x$ (both in $\mathcal{X}$) is given by
\[ K(x, y) := \begin{cases} \lambda_i/\gamma & \text{if } x_i = 0,\ y_i = 1, \text{ and } x_{V(G)\setminus\{i\}} = y_{V(G)\setminus\{i\}}, \\ 1/\gamma & \text{if } x_i = 1,\ y_i = 0, \text{ and } x_{V(G)\setminus\{i\}} = y_{V(G)\setminus\{i\}}, \\ 0 & \text{otherwise.} \end{cases} \tag{1.2} \]
The probability $K(x, x)$ is simply chosen so as to make $K$ a stochastic matrix.
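As a sanity check on the kernel just described, the following sketch builds $K$ on a very small graph by brute-force enumeration and makes it possible to verify stochasticity and detailed balance with respect to the product weights $\prod_k \lambda_k^{x_k}$. The helper names (`independent_sets`, `transition_kernel`, `stationary_weight`) and the three-site path are assumptions made for illustration only. Exact rational arithmetic is used so that the checks are not affected by rounding.

```python
from fractions import Fraction
from itertools import combinations


def independent_sets(neighbours):
    """Enumerate all valid configurations (independent sets) as frozensets."""
    sites = sorted(neighbours)
    result = []
    for r in range(len(sites) + 1):
        for subset in combinations(sites, r):
            s = frozenset(subset)
            if all(not (neighbours[k] & s) for k in s):
                result.append(s)
    return result


def transition_kernel(neighbours, rates):
    """Return (K, gamma), where K[x][y] is the one-step probability x -> y."""
    gamma = sum(rates[k] + 1 for k in neighbours)
    states = independent_sets(neighbours)
    valid = set(states)
    K = {}
    for x in states:
        row = {}
        for k in neighbours:
            y = x ^ {k}  # flip the state of site k
            if y not in valid:
                continue  # birth blocked by the exclusion constraint
            # Birth at an empty site has weight lambda_k, death has weight 1.
            row[y] = Fraction(rates[k], gamma) if k not in x else Fraction(1, gamma)
        row[x] = 1 - sum(row.values())  # holding probability K(x, x)
        K[x] = row
    return K, gamma


def stationary_weight(x, rates):
    """Unnormalised stationary weight: product of lambda_k over occupied sites."""
    w = Fraction(1)
    for k in x:
        w *= rates[k]
    return w


# Hypothetical example: path 0 - 1 - 2 with U = {0, 2} and V = {1}.
path = {0: {1}, 1: {0, 2}, 2: {1}}
rates = {0: 2, 1: 8, 2: 2}
K, gamma = transition_kernel(path, rates)
reversible = all(
    stationary_weight(x, rates) * p == stationary_weight(y, rates) * K[y][x]
    for x, row in K.items()
    for y, p in row.items()
)
print(gamma, len(K), reversible)
```

The enumeration is exponential in the number of sites, so this is meant only for checking the definitions on toy graphs.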
In summary, the discrete-time chain $(X(n))_{n \in \mathbb{N}}$ (where $\mathbb{N} := \{0, 1, 2, \ldots\}$) and the continuous-time chain $(\hat{X}(t))_{t \in [0,\infty)}$ are connected via the coupling $\hat{X}(t) := X(\xi([0, t]))$, where $\xi$ is a Poisson process with rate $\gamma$ independent of $(X(n))_{n \in \mathbb{N}}$.
The above process is the dynamic version of the hard-core gas model. Throughout this paper, we assume that the underlying graph is bipartite, i.e., the sites of the graph can be partitioned into two disjoint sets $U$ and $V$ in such a way that every edge of the graph has one endpoint in $U$ and the other endpoint in $V$. In the sequel, we will assume that $\lambda_k = \lambda$ for all $k \in U$ and $\lambda_k = \bar\lambda$ for all $k \in V$, where $\lambda, \bar\lambda \in \mathbb{R}^+$. A simple example of a bipartite graph on which the hard-core dynamics exhibits very strong metastable behaviour is the complete bipartite graph (Fig. 3a) in which every site in $U$ is connected by an edge to every site in $V$: starting from the configuration $u$ with particles at every site in $U$, the system must first remove every single particle from $U$ in order to be able to place a particle on $V$ and eventually reach the configuration $v$ with particles at every site in $V$. A more interesting example is an even torus graph $\mathbb{Z}_m \times \mathbb{Z}_n$ ($m$ and $n$ even) with nearest-neighbour edges, in which case $U$ and $V$ can be chosen to be the sets of sites for which the sum of the coordinates is even or odd, respectively (Fig. 1a).
A further class of interesting examples arises from the two-species Widom-Rowlinson model, which has an equivalent representation in our setting. The (dynamic) Widom-Rowlinson model (see e.g. Lebowitz and Gallavotti [39]) is similar. In this model there are two types of particles, red and blue. Again, each site of the graph can be occupied by at most one particle, which can be of either type, but the exclusion constraint acts between opposite types only: two particles of opposite colour cannot simultaneously sit on two neighbouring sites.
The dynamics is governed by three families of independent Poisson clocks:
Birth of red: Clock $\xi^{\mathrm{rb}}_k$ has rate $\lambda_r > 0$. Every time $\xi^{\mathrm{rb}}_k$ ticks, an attempt is made to place a red particle at site $k$. If one of the neighbours of site $k$ carries a blue particle, or if there is already a particle on $k$, then the attempt fails.
Birth of blue: Clock $\xi^{\mathrm{bb}}_k$ has rate $\lambda_b > 0$. Every time $\xi^{\mathrm{bb}}_k$ ticks, an attempt is made to place a blue particle at site $k$. If one of the neighbours of site $k$ carries a red particle, or if there is already a particle on $k$, then the attempt fails.
Death: Clock $\xi^{\mathrm{d}}_k$ has rate 1. Every time $\xi^{\mathrm{d}}_k$ ticks, an attempt is made to remove a particle from site $k$. If the site is already empty, then nothing is changed.
The Widom-Rowlinson model on a graph $G = (V(G), E(G))$ has a faithful representation in terms of the hard-core process on a bipartite graph $G^{[2]}$ obtained from $G$, which we call the doubled version of $G$ (see Fig. 2). The graph $G^{[2]}$ has vertex set $V(G^{[2]}) := V(G) \times \{r, b\}$ with two parts $U^{[2]} := \{(k, r) : k \in V(G)\}$ and $V^{[2]} := \{(k, b) : k \in V(G)\}$, which are the coloured copies of $V(G)$. There is an edge between a red site $(i, r)$ and a blue site $(j, b)$ if and only if either $i = j$ or $(i, j)$ is an edge in $E(G)$ (Fig. 2). There are no edges between red sites nor between blue sites. The configurations of the Widom-Rowlinson model on $G$ are in obvious one-to-one correspondence with the configurations of the hard-core model on $G^{[2]}$. Namely, a configuration $x$ of the Widom-Rowlinson model corresponds to a configuration $x^{[2]}$ of the hard-core model on the doubled graph, where $x_i = r$ if and only if $x^{[2]}_{(i,r)} = 1$ and $x_i = b$ if and only if $x^{[2]}_{(i,b)} = 1$. Furthermore, this correspondence is respected by the stochastic dynamics of the two models. So in short, studying the Widom-Rowlinson model on $G$ amounts to studying the hard-core model on the doubled graph $G^{[2]}$.
Figure 2: A graph and its doubled version. Panels include the doubled graph $G^{[2]}$ and (c) a different drawing of $G^{[2]}$.
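The doubling construction is easy to state in code. The sketch below follows the edge rule given above (a red copy of $i$ is joined to the blue copy of $j$ exactly when $i = j$ or $i$ and $j$ are adjacent); the function name `doubled_graph` and the 4-cycle example are hypothetical choices made for illustration.

```python
def doubled_graph(vertices, edges):
    """Return (red_sites, blue_sites, doubled_edges) for the doubled graph.

    vertices: iterable of vertex labels of a simple graph G
    edges: iterable of pairs (i, j) of adjacent vertices of G
    Each doubled edge is a pair ((i, "r"), (j, "b")).
    """
    vertices = list(vertices)
    adjacent = {frozenset(e) for e in edges}
    red = [(k, "r") for k in vertices]
    blue = [(k, "b") for k in vertices]
    doubled_edges = {
        ((i, "r"), (j, "b"))
        for i in vertices
        for j in vertices
        if i == j or frozenset((i, j)) in adjacent
    }
    return red, blue, doubled_edges


# Hypothetical example: the 4-cycle 0 - 1 - 2 - 3 - 0.
red, blue, doubled_edges = doubled_graph(
    [0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 0)]
)
print(len(red), len(blue), len(doubled_edges))
```

By construction the doubled graph has $|V(G)| + 2|E(G)|$ edges: one for each vertex (the two colours at the same site exclude each other) and two for each edge of $G$.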

Three metastability theorems
For the hard-core model on a bipartite graph $(U, V, E)$, we write $u$ for the configuration that has a particle at every site of $U$, and $v$ for the configuration that has a particle at every site of $V$. For the activity parameters, we choose $\lambda_k = \lambda$ for $k \in U$ and $\lambda_k = \bar\lambda$ for $k \in V$, and we assume that
\[ \bar\lambda = \varphi(\lambda) = \lambda^{1+\alpha+o(1)} \quad \text{as } \lambda \to \infty \]
for some constant $0 < \alpha < 1$. In other words, the activities of the sites in $V$ are slightly stronger than those of the sites in $U$, and in particular $\lambda = o(\bar\lambda)$. The symmetric scenario in which $\alpha = 0$ is treated by Nardi, Zocca and Borst [43]. The assumption $\alpha < 1$ is not crucial but will shorten the arguments at the cost of excluding the less interesting cases in which the critical droplet is trivial. In the present paper, we focus on the case in which $|U| < (1 + \alpha)|V|$. This ensures that $v$ has the largest stationary probability among all configurations. The opposite case can be treated similarly.
When $\lambda \to \infty$, we expect noticeable metastability when starting from $u$. Namely, although the configuration $v$ takes up the overwhelmingly largest portion of the equilibrium probability mass, the process starting from $u$ remains in the vicinity of $u$ for a long time before the formation of a 'critical droplet' and the eventual transition to $v$. The choice $\lambda^{1+\alpha+o(1)}$ for $\varphi(\lambda)$ ensures that the size of the critical droplet is non-trivial (neither going to 0 nor to $\infty$ as $\lambda \to \infty$). With this choice, and writing $x_U := x \cap U$ and $x_V := x \cap V$, we may think of
\[ H(x) := -|x_U| - (1 + \alpha)|x_V| \]
as an appropriate notion of energy or height of configuration $x$, although we should keep in mind that the probability $\pi(x)$ and the height $H(x)$ are related only through the asymptotic equality $\pi(x) = \frac{1}{Z} \lambda^{-H(x)+o(1)}$. (In particular, note that the factor $\lambda^{o(1)}$ is allowed to go to $\infty$ as $\lambda \to \infty$.) This interpretation provides the connection with the usual setting of metastability on which the current paper is based. As it turns out, the factor $\lambda^{o(1)}$ does not alter the size or shape of the critical droplet, and only affects the transition time (see also Cirillo, Nardi and Sohier [20]).
On a typical transition path from $u$ to $v$, the configurations near the bottleneck (i.e., those representing the critical droplet) solve a (non-standard) isoperimetric problem on the underlying bipartite graph. The isoperimetric cost of a set $A \subseteq V$ is defined as $\Delta(A) := |N(A)| - |A|$, where $N(A)$ is the neighbourhood of $A$, i.e., the set of sites in $U$ with a neighbour in $A$. The smallest possible isoperimetric cost for a set of cardinality $s$ is denoted by $\Delta(s)$. A set that achieves this minimum is said to be isoperimetrically optimal. The isoperimetric problem associated with the graph $(U, V, E)$ asks for the optimal values $\Delta(s)$ and the optimal sets. An isoperimetric numbering is a sequence $a_1, a_2, \ldots, a_n$ of distinct elements in $V$ such that for each $1 \le i \le n$, the set $A_i := \{a_1, a_2, \ldots, a_i\}$ is isoperimetrically optimal.
Our main results concern the hard-core model on a bipartite graph with the above choices of the relevant parameters, and rely on fairly general (though not necessarily easily verifiable) hypotheses regarding the isoperimetric properties of the underlying graph. These hypotheses are not the most general possible and can certainly be relaxed. Our goal is to show how they can be put to use in a few concrete examples: the torus $\mathbb{Z}_m \times \mathbb{Z}_n$ (where $m$ and $n$ are sufficiently large even numbers), the hypercube $\mathbb{Z}_2^m$, regular tree-like graphs and the doubled versions of these (see Fig. 1-2). In the case of the torus, where we have a rather complete understanding of the isoperimetric properties (via reduction to standard isoperimetric problems), we verify that all the required hypotheses are indeed satisfied. For the other examples, we are able to verify only some of the hypotheses, thereby obtaining only partial results. Complete descriptions remain contingent upon a better understanding of the corresponding isoperimetric problems.
Our first two theorems establish asymptotics for the mean and the distribution of the crossover time (i.e., the hitting time of $v$ starting from $u$). Let $s^*$ be the smallest positive integer maximising $g(s) := \Delta(s) - \alpha(s - 1)$. We call $s^*$ the critical size. Let $\tilde{s}$ be the smallest integer larger than $s^*$ such that $\Delta(\tilde{s}) \le \alpha\tilde{s}$. We call $\tilde{s}$ the resettling size. The required hypotheses for these two theorems are the following:
H1 There exists an isoperimetric numbering of length at least $\tilde{s}$.
H2 For every $a \in V$, there exists an isoperimetric numbering of length at least $\tilde{s}$ starting with $a$.
Clearly (H2) implies (H1). In fact, the following theorems require the stronger hypothesis (H2) but we have stated (H1) for future reference. The existence of the resettling size is ensured by hypothesis (H0).
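On a small graph the critical size and the resettling size can be found by exhaustive search. The sketch below is a brute-force illustration, not the method of the paper: it assumes that the isoperimetric cost of $A \subseteq V$ is $|N(A)| - |A|$, with $N(A)$ the set of sites in $U$ adjacent to $A$, and all function names and the 8-cycle example are hypothetical. The parameter $\alpha$ is passed as an exact fraction so that ties in $g$ are resolved exactly.

```python
from fractions import Fraction
from itertools import combinations


def isoperimetric_profile(u_neighbours):
    """Return a list delta with delta[s] = min over |A| = s of |N(A)| - |A|.

    u_neighbours: dict mapping each site a in V to its set of neighbours in U.
    """
    v_sites = sorted(u_neighbours)
    delta = []
    for s in range(len(v_sites) + 1):
        best = None
        for subset in combinations(v_sites, s):
            neighbourhood = set().union(*(u_neighbours[a] for a in subset))
            cost = len(neighbourhood) - s
            if best is None or cost < best:
                best = cost
        delta.append(best)
    return delta


def critical_size(delta, alpha):
    """Smallest positive s maximising g(s) = delta[s] - alpha * (s - 1)."""
    g = {s: delta[s] - alpha * (s - 1) for s in range(1, len(delta))}
    best = max(g.values())
    return min(s for s, value in g.items() if value == best)


def resettling_size(delta, alpha):
    """Smallest s larger than the critical size with delta[s] <= alpha * s."""
    s_star = critical_size(delta, alpha)
    for s in range(s_star + 1, len(delta)):
        if delta[s] <= alpha * s:
            return s
    return None  # no such size exists on this graph


# Hypothetical example: the 8-cycle, with V the odd sites and U the even sites.
cycle = {a: {(a - 1) % 8, (a + 1) % 8} for a in (1, 3, 5, 7)}
delta = isoperimetric_profile(cycle)
alpha = Fraction(1, 2)
print(delta, critical_size(delta, alpha), resettling_size(delta, alpha))
```

The search runs over all subsets of $V$, so it is only practical for a handful of sites; for the graphs treated in the paper the profile $\Delta(s)$ has to be obtained from isoperimetric arguments instead.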
For the next theorem, we need a few extra definitions and hypotheses. Note that Theorem 1.1 provides only the order of magnitude of the mean crossover time $\mathbb{E}_u[T_v]$ as $\lambda \to \infty$. More accurate asymptotics (the pre-factor) require a more detailed description of the bottleneck (the critical droplets), which in turn requires a better understanding of the isoperimetric properties of the underlying graph. More specifically, we need an understanding of the evolution of the set of occupied sites in $V$ during the crossover from $u$ to $v$. We call a sequence of sets $A_0, A_1, \ldots$ […]
H4 There exist two families $\mathcal{A}, \mathcal{B}$ of subsets of $V$ such that
(a) the elements of $\mathcal{A}$ and $\mathcal{B}$ are isoperimetrically optimal with $|A| = s^* - 1$ for each $A \in \mathcal{A}$ and $|B| = s^*$ for each $B \in \mathcal{B}$,
(b) for each $A \in \mathcal{A}$, there is an isoperimetric progression from $\varnothing$ to $A$, consisting only of sets of size at most $s^* - 1$,
(c) for each $B \in \mathcal{B}$, there is an isoperimetric progression from $B$ to a set of size $\tilde{s}$, consisting only of sets of size at least $s^*$.
We interpret an element of $\mathcal{B}$ as a critical droplet on $V$. Given two families $\mathcal{A}$ and $\mathcal{B}$ satisfying (H4), we define two sets of configurations $Q$ and $Q^*$ as follows. The set $Q^*$ consists of configurations $y$ such that $y_V = A$ and $y_U = U \setminus N(B)$ for some $A \in \mathcal{A}$ and $B \in \mathcal{B}$ with $|B \setminus A| = 1$. A configuration $x$ is in $Q$ if it can be obtained from a configuration $y \in Q^*$ by adding a particle on $U$. We denote by $[Q, Q^*]$ the set of possible transitions $x \to y$ where $x \in Q$ and $y \in Q^*$. In other words, $[Q, Q^*]$ consists of pairs $(x, y) \in Q \times Q^*$ such that $x$ and $y$ differ by a single particle.
With probability approaching 1 as $\lambda \to \infty$, the random trajectory from $u$ to $v$ makes precisely one transition $x \to y$ from $[Q, Q^*]$, every configuration that follows the transition $x \to y$ has at least $s^*$ particles on $V$, and every configuration preceding $x \to y$ has at most $s^* - 1$ particles on $V$. Moreover, the choice of the transition $x \to y$ is uniform among all possibilities in $[Q, Q^*]$.
Let $0 \le \kappa < 1/\alpha$ be an integer (e.g., $\kappa := \lceil 1/\alpha \rceil - 1$) and define
\[ \mathcal{A} := \{A \subseteq V : A \text{ is isoperimetrically optimal with } |A| = s^* - 1\}, \tag{1.9} \]
\[ \mathcal{C} := \{C \subseteq V : C \text{ is isoperimetrically optimal with } |C| = s^* + \kappa\}, \]
\[ \mathcal{B} := \{\ldots \text{ there exists an isoperimetric progression } \ldots\}. \]
Observe that $|B| = s^*$ for every $B \in \mathcal{B}$. Consider the following hypotheses:
H6 (a) For each $A \in \mathcal{A}$, there is an isoperimetric progression from $\varnothing$ to $A$, consisting only of sets of size at most $s^* - 1$.
(b) For each $C \in \mathcal{C}$, there is an isoperimetric progression from $C$ to a set of size $\tilde{s}$, consisting only of sets of size at least $s^*$.
Appendix A recalls some basic facts from potential theory for reversible Markov chains. Appendix B provides a characterisation of metastability in terms of recurrence of metastable states and passage through bottlenecks. Appendix C collects the proofs of all the propositions and lemmas appearing in Section 5 and Appendices A and B. Proposition 1.4 is proved in Appendix C.12 via a detailed study of typical paths near the critical droplet in Section 3.5.

Hard-core dynamics on bipartite graphs
In this section, we describe the metastable behaviour of the hard-core process on bipartite graphs. We use the setting of Section 1, and along the way use some basic results that are collected in Appendices A-B, adding pointers to the relevant definitions listed there. After some preparatory observations (Section 2.1), we start by listing a few 'simple examples' for which the above task can be carried out via simple inspection (Section 2.2). For more 'sophisticated examples' the problem of identifying the critical resistance and the critical gate leads to a (non-standard) combinatorial isoperimetric problem (Section 2.3).

Preparatory observations
Recall that the underlying bipartite graph has two parts $U$ and $V$. Particles are added to or removed from each site independently with constant rates and subject to the exclusion constraints prescribed by the graph. The rates of adding particles to empty sites in $U$ and $V$ are $\lambda$ and $\bar\lambda$, respectively, and the rate of removing a particle from a site is 1. We assume that $\bar\lambda = \varphi(\lambda) = \lambda^{1+\alpha+o(1)}$ as $\lambda \to \infty$, where $0 < \alpha < 1$. We write $u$ and $v$ to denote the fully-packed configurations with particles at every site of $U$ and $V$, respectively.
We let $K$ be the transition kernel of the discrete-time version of the Markov chain, and $\gamma = (1 + \lambda)|U| + (1 + \bar\lambda)|V|$ the Poisson rate for the continuous-time Markov chain.
The stationary distribution of the Markov chain is
\[ \pi(x) = \frac{1}{Z} \lambda^{|x_U|} \bar\lambda^{|x_V|}, \]
where $x_U = x \cap U$ and $x_V = x \cap V$ are the restrictions of the configuration $x$ to $U$ and $V$, respectively, and $Z$ is the normalising constant. This has the asymptotic form
\[ \pi(x) = \frac{1}{Z} \lambda^{|x_U| + (1+\alpha+o(1))|x_V|} \quad \text{as } \lambda \to \infty. \]
The conductance between two configurations $x, y \in \mathcal{X}$ is given by
\[ c(x, y) = \pi(x) K(x, y) = \frac{1}{\gamma} \pi(x \cup y) \]
when $x$ and $y$ differ at a single site, and 0 otherwise. A transition from a configuration $x$ to a distinct configuration $y$ occurs by adding or removing a particle. We denote a transition corresponding to adding a particle by $x \xrightarrow{+V} y$ or $x \xrightarrow{+U} y$, depending on whether the particle is added to $V$ or to $U$. If we do not want to emphasise where the new particle is placed, then we simply write $x \xrightarrow{+} y$. Transitions corresponding to removing a particle are denoted accordingly by $x \xrightarrow{-V} y$, $x \xrightarrow{-U} y$ and $x \xrightarrow{-} y$.
In the asymptotic regime $\lambda \to \infty$, the configuration $v$ is a stable state, in the sense that it is recurrent on any time scale (see Section B.1), as long as $|U| < (1 + \alpha)|V|$. Our aim is to describe the transition from $u$ to $v$, at least for some characteristic choices of the underlying graph. Let $J(a)$ and $J^-(a)$ denote the set of states whose stationary probabilities are asymptotically at least as large as, respectively, asymptotically larger than the stationary probability of $a$ (see (B.1)). We need to
(i) identify $\Psi(u, J(u))$, the critical resistance between $u$ and $J(u)$ (see (A.15)),
(ii) verify that the Markov chain has no trap state, i.e., every configuration $x \notin \{u, v\}$ satisfies $\pi(x)\Psi(x, J^-(x)) \prec \pi(u)\Psi(u, J(u))$ as $\lambda \to \infty$,
(iii) identify a critical gate between $u$ and $J(u)$ (see Section B.5).
Item (ii), together with Corollary B.7, shows the exponentiality of the distribution of the transition time from u to v on the time scale π(u)Ψ(u, v). Items (i-iii), together with Corollary B.4 and Propositions B.12-B.13, lead to a sharp asymptotic estimate for the expected transition time and the identification of the shape of the critical droplets.

Simple examples
Example 2.1 (Complete bipartite graph). The most pronounced example of metastability of the hard-core process occurs when the underlying graph is a complete bipartite graph $K_{m,n}$, i.e., $|U| = m$ and $|V| = n$, and every site in $U$ is connected by an edge to every site in $V$ (Fig. 3a). The configuration space is $\mathcal{X} = \{0, 1\}^U \cup \{0, 1\}^V$. We assume that $m \le (1 + \alpha)n$ to make sure that the configuration $v$ is a stable state, in particular, $v \in J(u)$. Note that every path from $u$ to $v$ has a transition from a configuration with a single particle on $U$ and no particle on $V$ to the empty configuration $\varnothing$. Such a transition has the largest resistance $\frac{\gamma}{\lambda\pi(\varnothing)} = \gamma Z \lambda^{-1}$. Therefore the critical resistance between $u$ and $v$ is $\Psi(u, v) = \frac{\gamma}{\lambda\pi(\varnothing)}$. On the other hand, from any other configuration $x \notin \{u, v\}$ it is possible to add a new particle, which means that $\Psi(x, J^-(x)) \preceq \frac{\gamma}{\lambda\pi(x)}$. Therefore
\[ \pi(x)\Psi(x, J^-(x)) \preceq \frac{\gamma}{\lambda} \prec \gamma\lambda^{m-1} = \pi(u)\Psi(u, v), \]
i.e., the chain has no trap. In particular, with $R(u \leftrightarrow v)$ the effective resistance between $u$ and $v$ (see Appendix A.1), […]
The effective resistance can now be accurately estimated by identifying the critical gate between $u$ and $v$ (Proposition B.12), but for the sake of exposition, let us estimate it by direct calculation. This is possible because of the high degree of symmetry in the graph. Let $W$ be the voltage when $u$ is connected to a unit voltage source and $v$ is connected to the ground. By symmetry, all the configurations with $i \ne 0$ particles on $U$ have the same voltage. Therefore, by the short-circuit principle, we can identify them with a single node, which we call $U_i$. Similarly, we can contract all the configurations with $j \ne 0$ particles on $V$ into a single node $V_j$. We then obtain a new network with nodes
\[ U_m, U_{m-1}, \ldots, U_1, \ U_0 = \varnothing = V_0, \ V_1, \ldots, V_n, \]
where $U_i$ is connected to $U_{i-1}$ by a resistor with conductance
\[ \frac{1}{\gamma Z}\, i \binom{m}{i} \lambda^i \]
and, similarly, $V_j$ is connected to $V_{j-1}$ by a resistor with conductance
\[ \frac{1}{\gamma Z}\, j \binom{n}{j} \bar\lambda^j. \]
We now have, by the series law,
\[ R(u \leftrightarrow v) = \sum_{i=1}^{m} \frac{Z\gamma}{i \binom{m}{i} \lambda^i} + \sum_{j=1}^{n} \frac{Z\gamma}{j \binom{n}{j} \bar\lambda^j}. \tag{2.9} \]
As $\lambda \to \infty$, the dominant term is $i = 1$ (corresponding to removal of the last particle from $U$). Hence,
\[ R(u \leftrightarrow v) = \frac{Z\gamma}{m\lambda}\,[1 + o(1)]. \]
Alternatively, it is easy to see that if we let $Q$ be the set of all configurations that have a single particle on $U$ and $Q^* := \{\varnothing\}$, then $[Q, Q^*]$ (the set of probable transitions between $Q$ and $Q^*$ defined in (B.9)) is a critical gate between $u$ and $v$, and we obtain (Proposition B.12) that
\[ C(u \leftrightarrow v) = \frac{m\lambda}{Z\gamma}\,[1 + o(1)], \tag{2.11} \]
where $C(u \leftrightarrow v)$ is the effective conductance between $u$ and $v$ (see Appendix A.1).
In conclusion, […]
Example 2.2 (Even cycle). […] The critical transition when going from $u$ to $v$ in an optimal path is between a configuration with a single particle missing from a site in $U$ and a configuration with two particles missing from two consecutive sites in $U$. After that, the Markov chain can go "downhill" by adding a particle to the freed site in $V$ and continue alternating between moves $-U$ and $+V$ until the stable configuration $v$ is reached. Thus, if $Q$ is the set of configurations with a particle missing from a single site in $U$ and $Q^*$ is the set of configurations with particles missing from two consecutive sites in $U$, the critical gate is $[Q, Q^*]$, which gives […] in the discrete-time and continuous-time setting, respectively. The hitting times $T_v$ and $\hat{T}_v$ are again asymptotically exponentially distributed, and the Markov chain undergoes a rapid transition when going from $u$ to $v$. Furthermore, the chain goes almost surely […]
To see that the chain has no trap, we note that any configuration in $J(u)$ must have at least one particle on $V$. Thus from a configuration $x \in J(u) \setminus \{v\}$, it is either possible to add a new particle on $V$ or first remove a particle from $U$ and then add a new particle on $V$, so that $\pi(x)\Psi(x, J^-(x)) \preceq \gamma$ as $\lambda \to \infty$.
Example 2.3 […] illustrates a phenomenon that is not present in the other examples considered in this paper. Namely, the condition of absence of traps is not satisfied. As a result, the scaled crossover time from $u$ to $v$ does not converge to an exponential random variable but to the sum of $n$ independent exponential random variables.
Indeed, consider the continuous-time process and assume that λ is very large. Starting from u, it takes a rate 1 exponential time for each particle on U to be removed.
Once a particle is removed, it is quickly replaced by another particle in a time that is o(1), so that at an overwhelming majority of times the system is at a maximally packed configuration. If the particle is removed from any site other than 2n − 2, then the new particle arrives necessarily at the same position, while if the particle is removed from site 2n − 2, then the replacing particle arrives with probability 1 − o(1) at site 2n − 1.
In the next stage, after a time with an approximate exponential distribution, a particle is removed from site $2n - 4$ and is replaced with a particle at site $2n - 3$. In the same fashion, after $n$ such replacements, the Markov chain reaches configuration $v$. Thus, in the limit $\lambda \to \infty$, the crossover time $\hat{T}_v$ starting from $u$ becomes a sum of $n$ independent exponential random variables, each with rate 1.
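For orientation, a sum of $n$ independent rate-1 exponential random variables follows the standard Gamma (Erlang) law. Writing $S_n$ for such a sum (a symbol introduced here only for illustration), its density and first two moments are:

```latex
\[
  f_{S_n}(t) = \frac{t^{\,n-1} e^{-t}}{(n-1)!}, \quad t \ge 0,
  \qquad
  \mathbb{E}[S_n] = n,
  \qquad
  \operatorname{Var}(S_n) = n .
\]
```

In particular, for $n \ge 2$ the ratio of standard deviation to mean is $1/\sqrt{n} < 1$, whereas it equals 1 for an exponential law, so the limiting crossover time is indeed not exponential.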
Let us sketch how this can be made precise using the machinery of Appendices A-B.
For k ∈ {0, . . . , n − 1}, let q k denote the configuration with particles on {2i : i < 2(n − k)} ∪ {2i + 1 : i ≥ 2(n − k)}, and let q * k be the configuration obtained from q k by removing a particle from 2(n − k − 1). Observe that q 0 = u and set q n v. We can verify that Ψ q k , J(q k ) = r(q k , q * k ) = γ/π(q k ) and that ({q k }, {q * k }) is a critical pair between q k and J(q k ). Therefore, Corollary B.4 and Proposition B.12 imply that E q k [T J(q k ) ] = γ[1 + o(1)], and Corollary B.7 shows that, starting from q k , the hitting time T J(q k ) /γ is asymptotically exponentially distributed with rate 1. Proposition B.13 and the fact that K q * k , q k+1 = λ/γ = 1 − o(1) imply that P q k (T J(q k ) = T q k+1 ) = 1 − o(1). It follows that, as λ → ∞, the scaled crossover time T qn /γ converges in distribution to a sum of n independent exponential random variables with rate 1 corresponding to the segments T q k+1 − T q k . Example 2.4 (Path with even length and even endpoints). The hard-core process on a path with even length (Fig. 3d) has quite a different behaviour. Let U = {0, 2, . . . , 2n} and V = {1, 3, . . . , 2n − 1}, so both endpoints of the path belong to U . In this case, the trajectory from u to v is closer to the hard-core model on an even cycle (Example 2.2).
We similarly find that [...] with an asymptotic exponential law for $T_v$.

Example 2.5 (Even cyclic ladder).
Let the underlying graph be the cyclic ladder $\mathbb{Z}_{2n} \times \mathbb{Z}_2$ (Fig. 4a) with $U := \{(i,j) : i+j \equiv 0 \pmod 2\}$ and $V := \{(i,j) : i+j \equiv 1 \pmod 2\}$. Every site in the graph has three neighbours. Let $Q$ be the set of configurations that are obtained from $u$ by removing two particles from the neighbourhood of a site $k \in V$, and $Q^*$ the set of configurations that are obtained from $u$ by removing all three particles from the neighbourhood of a site $k \in V$. We may verify that $[Q, Q^*]$ is a critical gate, and that the Markov chain has no trap. There are $6n$ possible transitions $Q \to Q^*$, each having resistance $\gamma Z \lambda^{-(n-2)}$. It follows that state $u$ undergoes a metastability transition with [...] (2.16) and, starting from $u$, the distribution of $T_v / \mathbb{E}_u[T_v]$ converges to that of an exponential random variable with unit rate. Furthermore, the transition occurs within a period that is short compared to $\frac{1}{6n}\lambda^2$ and goes (with a probability tending to 1) through exactly one of the moves $Q \to Q^*$, each with probability $\frac{1}{6n}$.

EJP 23 (2018), paper 97.
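The count of $6n$ transitions $Q \to Q^*$ can be checked by brute force on small ladders. The following is a minimal sketch, not the authors' argument; the helper names `cyclic_ladder` and `count_gate_moves` are hypothetical, and configurations are represented only by their particles on $U$ (no particle sits on $V$ in $Q$ or $Q^*$). It assumes $n \ge 2$ so that the three neighbours of each site are distinct.

```python
from itertools import combinations

def cyclic_ladder(n):
    """Parts U, V and neighbour map of the cyclic ladder Z_{2n} x Z_2 (hypothetical helper)."""
    sites = [(i, j) for i in range(2 * n) for j in range(2)]
    U = {s for s in sites if (s[0] + s[1]) % 2 == 0}
    V = set(sites) - U

    def nbrs(site):
        i, j = site
        return {((i - 1) % (2 * n), j), ((i + 1) % (2 * n), j), (i, 1 - j)}

    return U, V, nbrs

def count_gate_moves(n):
    """Number of pairs (x, y) with x in Q, y in Q*, and y = x minus one particle."""
    U, V, nbrs = cyclic_ladder(n)
    u = frozenset(U)
    # Q: remove two of the three neighbours of some k in V from u.
    Q = {u - frozenset(pair) for k in V for pair in combinations(sorted(nbrs(k)), 2)}
    # Q*: remove all three neighbours of some k in V from u.
    Q_star = {u - frozenset(nbrs(k)) for k in V}
    return sum(1 for x in Q for y in Q_star if y < x and len(x - y) == 1)

if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, count_gate_moves(n))
```

Each of the $2n$ sites $k \in V$ contributes three moves (one per neighbour removed last), which is where the factor $6n$ comes from.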
Figure 4: A doubled even cycle is isomorphic to a cyclic ladder. (a) A cyclic ladder. (b) A doubled even cycle.
Example 2.6 (Widom-Rowlinson on an even cycle). As discussed earlier, the Widom-Rowlinson model on a graph is equivalent to the hard-core model on the doubled version of that graph. This example reduces to Example 2.5 after we note that the doubled graph of a cycle Z 2n is isomorphic to a cyclic ladder (Fig. 4).
Note that in each of the above examples, the expected transition time E u [T v ] and the critical gate are independent of the parameter α. This is not consistent with the physical intuition of a critical droplet as a point of balance between the cost of removing particles from U and the gain of placing particles on V . Such physical intuition becomes the key to identifying the critical gate when the underlying graph has a more geometric structure. We will keep as our guiding example an even torus Z m × Z n .

Sophisticated examples
The problem of identifying the critical gate between u and v (or u and J(u)) gives rise to a combinatorial isoperimetric problem. The reason for the appearance of an isoperimetric problem can be intuitively understood as follows. When λ is large, the Markov chain tends to remain at configurations of particles that are close to maximal packing arrangements. Whenever one or more particles disappear from the graph, other particles quickly replace them, though potentially on different sites. Since the disappearance of particles is a much slower process, the typical trajectories tend to go through configurations that require the removal of the least possible number of particles. The system thus tends to make the transition from u to v by growing a droplet of closely packed particles on V in such a way as to require the removal of as few particles from U as possible. In particular, near the bottleneck between u and v (i.e., close to the largest necessary deviation), the system typically goes through maximal packing configurations that are as efficient as possible, playing the role of the critical droplet. Near the bottleneck, the system solves the optimisation problem of maximal packing with a constraint on the number of particles on V, i.e., the size of the critical droplet.
[...] $x \in \mathcal{X}$ and $|x_V| = s\}$ for $s \in \mathbb{N}$. (2.17)

Note that the stationary probability of a configuration $x \in \mathcal{X}$ with $s := |x_V|$ can be written as [...], which is bounded from above by [...] as $\lambda \to \infty$. We call $\Delta(A)$ and $\Delta(x)$ the isoperimetric cost of $A$ and $x$. The (bipartite) isoperimetric problem asks for the sets $A$ of fixed cardinality that minimise the cost $\Delta(A)$.

Let us also introduce some terminology to describe evolutions of subsets of $V$.
A nested isoperimetric progression from $A_0 = \emptyset$ to $A_n$ is associated with a sequence $a_1, a_2, \ldots, a_n$ of distinct elements in $V$ with $A_k := \{a_1, a_2, \ldots, a_k\}$. We call such a sequence an isoperimetric numbering of (some) elements of $V$.
The relevance of the isoperimetric problem will be further clarified in the following sections. For now, we mention four non-trivial examples of graphs for which we know (partial) solutions for the isoperimetric problem.

Example 2.7 (Even torus).
Rather than the isoperimetric problem on the torus $\mathbb{Z}_m \times \mathbb{Z}_n$, we describe the solutions of the isoperimetric problem on the infinite lattice $\mathbb{Z} \times \mathbb{Z}$, which with the nearest neighbour edges is bipartite. These solutions remain valid for the torus as long as the sets under consideration are too small to wrap around the torus. The solutions are obtained via reduction to the standard edge isoperimetric problem, whose solutions are well known [29,2]. The argument for the reduction is given in Section 5.1.1. The isoperimetric function is given by

$\Delta(\ell^2 + j) = 2\ell + 2$ for $\ell > 0$ and $0 < j \le \ell$, (2.20)
$\Delta(\ell(\ell+1) + j) = 2(\ell+1) + 1$ for $\ell \ge 0$ and $0 < j \le \ell+1$, (2.21)

and $\Delta(0) = 0$, which can also be written in the concise algebraic form $\Delta(s) = \lceil 2\sqrt{s}\, \rceil + 1$ for $s > 0$. The optimal sets $A$ realising $\Delta(|A|)$ are the following:
• A set $A \subseteq V$ with $|A| = \ell(\ell+1) + j$ with $0 < j \le \ell$ is optimal if and only if it consists of a tilted $\ell \times (\ell+1)$ rectangle plus a row of $j$ elements along one of the four sides of the rectangle (see Fig. 5c, Eq. (2.21) and Sec. 5.1.1).
We point out that some of the optimal sets described above can be generated by suitable isoperimetric numberings. Indeed, if we number the elements of $V$ in a spiral fashion as in Fig. 6a, then every initial segment of this numbering is an optimal set. Note, however, that some optimal sets are not captured by such a numbering. For instance, the example in Fig. 6b cannot be extended to an optimal set one element larger.

Example 2.8. Consider the doubled lattice, which is a bipartite graph with parts $U := \mathbb{Z} \times \mathbb{Z} \times \{r\}$ and $V := \mathbb{Z} \times \mathbb{Z} \times \{b\}$ [...] In particular, the bipartite isoperimetric cost of a set $A \times \{b\}$ is simply $|N(A) \setminus A|$, which is the size of the vertex boundary of $A$ in $\mathbb{Z} \times \mathbb{Z}$. This is indeed the case for every doubled graph (Observation 5.3). It follows that the bipartite isoperimetric problem on the doubled lattice is equivalent to the vertex isoperimetric problem on the lattice. The vertex isoperimetric problem on the lattice has been addressed by Wang and Wang [47], who found optimal sets of every cardinality. Their solutions are given by an isoperimetric numbering that identifies an infinite nested family of optimal sets. Fig. 7a illustrates an isoperimetric numbering similar to, but somewhat different from, that of Wang and Wang.
The isoperimetric function $s \mapsto \Delta(s)$ on the doubled lattice can now be given by [...] (2.23) and $\Delta(0) = 0$. Note that every positive integer can be written in a unique way as $s = \ell^2 + (\ell-1)^2 + r$ with $\ell > 0$ and $0 \le r < 4\ell$. Characterising all the optimal sets is more complicated. Vainsencher and Bruckstein [46] have obtained a characterisation of the optimal sets with certain cardinalities,

Example 2.9 (Tree-like regular graphs and their doubled graphs). Consider a $d$-regular graph $G$ in which every cycle has length at least $\ell$, where $d \ge 2$ and $\ell$ is large. Such a graph locally looks like a tree; in particular, every ball of radius $r < \ell/2$ in $G$ induces a tree.

First, suppose that $G$ is bipartite with two parts $U$ and $V$. If $G$ were an infinite $d$-regular tree, then every non-empty finite set $A \subseteq V$ would satisfy $|N(A)| \ge (d-1)|A| + 1$, with equality if and only if $A \cup N(A)$ is connected. This follows by induction or by a double counting argument. The same holds for a finite tree-like regular graph as long as [...]

Next, let us consider the isoperimetric problem on the doubled graph $G^{[2]}$ with $U := V(G) \times \{r\}$ and $V := V(G) \times \{b\}$. In this case, we can easily verify that every [...] with equality if and only if $A$ is connected in $G$. In particular, $\Delta(s) = (d-2)s + 2$ for $0 < s < \ell - 1$. An isoperimetric numbering of length $\ell - 2$ is obtained by any [...]

Example 2.10 (Hypercube and doubled hypercube). The $d$-dimensional hypercube is a graph $H_d$ whose vertices are the binary words $w \in \{0,1\}^d$ and in which two vertices $a$ and $b$ are connected by an edge if they disagree at exactly one coordinate, i.e., if their Hamming distance is 1. The bipartite isoperimetric problem on the doubled graph $H_d^{[2]}$ [...] The hypercube $H_d$ itself is bipartite with $U := \{w : \|w\| \equiv 0 \pmod 2\}$ and $V := \{w : \|w\| \equiv 1 \pmod 2\}$, where $\|w\|$ denotes the number of 1s in $w$. It is interesting to note that the doubled hypercube $H_d^{[2]}$ is isomorphic to the $(d+1)$-dimensional hypercube $H_{d+1}$ (Observation 5.4). Therefore, the solution of the vertex isoperimetric problem on hypercubes of arbitrary dimension also solves the bipartite isoperimetric problem on hypercubes. If $A \subseteq V(H_d)$ is an optimal set for the vertex isoperimetric problem on $H_d$, then the set $\hat{A} := \{wa : w \in A \text{ and } \|wa\| \equiv 1 \pmod 2\}$ is optimal for the bipartite isoperimetric problem on $H_{d+1}$, and vice versa.
For the vertex isoperimetric problem on $H_d$, Harper [30] provided an isoperimetric numbering of the entire graph (see also Bezrukov [8], Harper [31]). This numbering is obtained by ordering the elements of $\{0,1\}^d$ first according to the number of 1s, and then according to the reverse lexicographic order among the words with the same number of 1s. More specifically, the vertices of $H_d$ are numbered according to the total order $\prec$, where $w \prec w'$ when $\|w\| < \|w'\|$, or when $\|w\| = \|w'\|$ and there is a $k \in \{1, 2, \ldots, d\}$ such that $w_i = w'_i$ for $i < k$ and $w_k = 1$ and $w'_k = 0$. Bezrukov [7] has obtained a characterisation of the optimal sets of some but not all cardinalities.
[...] where $\Delta_{d+1}$ denotes the bipartite isoperimetric cost in $H_{d+1}$, or equivalently, the vertex isoperimetric cost in $H_d$. In Section 5.2.2, we will derive a recursive expression for the value of $\Delta_{d+1}(s)$ for general $s$.
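Harper's numbering is easy to generate and to test on small hypercubes. The sketch below is illustrative only: the function names are hypothetical, the vertex boundary of $A$ is taken to be $|N(A) \setminus A|$ as in the text, and optimality of initial segments is checked by exhaustive search, which is feasible only for very small $d$.

```python
from itertools import combinations, product

def harper_order(d):
    """Words of {0,1}^d sorted by number of 1s; ties broken so that the word
    with a 1 at the first differing coordinate comes first."""
    words = list(product((0, 1), repeat=d))
    return sorted(words, key=lambda w: (sum(w), tuple(-b for b in w)))

def vertex_boundary(A, d):
    """Size of N(A) minus A in the hypercube H_d."""
    A = set(A)
    boundary = set()
    for w in A:
        for k in range(d):
            v = w[:k] + (1 - w[k],) + w[k + 1:]  # flip coordinate k
            if v not in A:
                boundary.add(v)
    return len(boundary)

def min_vertex_boundary(s, d):
    """Exhaustive minimum of the vertex boundary over all s-subsets of H_d."""
    words = list(product((0, 1), repeat=d))
    return min(vertex_boundary(c, d) for c in combinations(words, s))

if __name__ == "__main__":
    d = 3
    order = harper_order(d)
    for s in range(1, 2 ** d):
        print(s, vertex_boundary(order[:s], d), min_vertex_boundary(s, d))
```

For $d = 3$ the two printed columns agree for every $s$, as Harper's theorem predicts; for larger $d$ the exhaustive search should be replaced by the recursive expression of Section 5.2.2.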

Preparation for sophisticated examples
Before we proceed with the 'sophisticated examples' of Section 2.3, we need some further preparation. One advantage of working with bipartite graphs is that there is a natural ordering on the configuration space (Section 3.1). We exploit this ordering to identify the critical resistance (Section 3.2), and prove the absence of trap states (Section 3.3) under certain assumptions on the solutions of the isoperimetric problem. The identification of the critical gate requires a detailed combinatorial analysis of the configurations close to the critical droplet (Sections 3.4-3.5). At various places we add pointers to definitions collected in Appendices A-B.

Ordering and correlations
An advantage of working with bipartite graphs is that the space of valid hard-core configurations on a bipartite graph admits a natural partial ordering. The transition kernel of the hard-core process is monotone with respect to this ordering and its unique stationary distribution is positively associated. Furthermore, two hard-core processes whose parameters satisfy appropriate inequalities can be coupled in such a way as to ensure that one always dominates the other. This ordering has earlier been exploited in the equilibrium setting by van den Berg and Steif [5].
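For two configurations $x, y \in \mathcal{X}$, we write $x \preceq y$ if $x_U \supseteq y_U$ and $x_V \subseteq y_V$. The relation $\preceq$ is a partial order and turns $\mathcal{X}$ into a lattice. The supremum $x \vee y$ and infimum $x \wedge y$ of two configurations $x, y \in \mathcal{X}$ are given by $(x \vee y)_U = x_U \cap y_U$, $(x \vee y)_V = x_V \cup y_V$ and $(x \wedge y)_U = x_U \cup y_U$, $(x \wedge y)_V = x_V \cap y_V$. For every two finite sets $A, B$ we clearly have $|A \cup B| + |A \cap B| = |A| + |B|$. It follows that the stationary distribution of the hard-core process satisfies $\pi(x \vee y)\,\pi(x \wedge y) \ge \pi(x)\,\pi(y)$ (3.3). By the theorem of Fortuin, Kasteleyn and Ginibre (see e.g. Grimmett [28, Section 4.2]), the above condition guarantees that $\pi$ is positively associated, i.e., $\pi(A \cap B) \ge \pi(A)\pi(B)$ for every two increasing events $A, B \subseteq \mathcal{X}$. We will, however, use the condition in (3.3) directly.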
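The lattice structure and the FKG-type condition can be verified exhaustively on a small bipartite graph. The sketch below makes several assumptions that are not taken verbatim from the text: the supremum and infimum are computed as $(x_U \cap y_U, x_V \cup y_V)$ and $(x_U \cup y_U, x_V \cap y_V)$, the unnormalised stationary weight is taken to be $\lambda^{|x_U|}\bar\lambda^{|x_V|}$, and the activities `lam`, `lam_bar` are arbitrary integers chosen for exact arithmetic. All function names are hypothetical.

```python
from itertools import combinations

def subsets(S):
    S = sorted(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

def hard_core_configs(U, V, edges):
    """All pairs (x_U, x_V) with no edge joining an occupied U-site to an occupied V-site."""
    return [(a, b) for a in subsets(U) for b in subsets(V)
            if not any(p in a and q in b for p, q in edges)]

def leq(x, y):
    """x precedes y iff x_U contains y_U and x_V is contained in y_V."""
    return x[0] >= y[0] and x[1] <= y[1]

def join(x, y):
    return (x[0] & y[0], x[1] | y[1])

def meet(x, y):
    return (x[0] | y[0], x[1] & y[1])

def weight(x, lam, lam_bar):
    """Unnormalised stationary weight (assumed product form)."""
    return lam ** len(x[0]) * lam_bar ** len(x[1])

def check_fkg(U, V, edges, lam=3, lam_bar=5):
    configs = hard_core_configs(U, V, edges)
    valid = set(configs)
    for x in configs:
        for y in configs:
            j, m = join(x, y), meet(x, y)
            if j not in valid or m not in valid:
                return False
            if not (leq(x, j) and leq(y, j) and leq(m, x) and leq(m, y)):
                return False
            if weight(j, lam, lam_bar) * weight(m, lam, lam_bar) < weight(x, lam, lam_bar) * weight(y, lam, lam_bar):
                return False
    return True

if __name__ == "__main__":
    # Even cycle Z_6 with U the even sites and V the odd sites.
    U, V = {0, 2, 4}, {1, 3, 5}
    edges = [((v + e) % 6, v) for v in V for e in (-1, 1)]
    print(check_fkg(U, V, edges))
```

Under the product-form assumption the inequality in fact holds with equality, which is exactly what the set identity $|A \cup B| + |A \cap B| = |A| + |B|$ gives.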
The monotonicity of the transition kernel $K$ can be seen via a direct coupling: given two configurations $x, x' \in \mathcal{X}$ with $x \preceq x'$, it is easy (e.g. via the construction described in Section 1.2) to construct two copies $\{X(n)\}_{n \in \mathbb{N}}$ and $\{X'(n)\}_{n \in \mathbb{N}}$ of the Markov chain with $X(0) = x$ and $X'(0) = x'$ such that almost surely $X(n) \preceq X'(n)$ for all $n \in \mathbb{N}$.
Let us mention an extension of the latter observation that we will need in follow-up work. Let $(\lambda_1, \bar\lambda_1)$ and $(\lambda_2, \bar\lambda_2)$ be two choices for the activity parameters of the sites in $U$ and $V$, and assume that $\lambda_1 \ge \lambda_2$ and $\bar\lambda_1 \le \bar\lambda_2$. Given $x^{(1)}, x^{(2)} \in \mathcal{X}$ satisfying $x^{(1)} \preceq x^{(2)}$, we can construct a coupling $\{(X^{(1)}(t), X^{(2)}(t))\}_{t \in [0,\infty)}$ of the continuous-time hard-core processes with parameters $(\lambda_1, \bar\lambda_1)$ and $(\lambda_2, \bar\lambda_2)$, respectively, in such a way that almost surely $X^{(1)}(t) \preceq X^{(2)}(t)$ for all $t \ge 0$. Namely, we use the same clocks $\xi^{\mathrm{d}}_k$ for the death of particles in both systems and we couple the birth clocks $\xi^{\mathrm{b},1}_k$ and $\xi^{\mathrm{b},2}_k$ used for [...]

Paths and progressions
Heuristically, we expect the transition from u to v to happen through the formation and growth of a droplet of particles on V . Such a growth process can be described by a progression from ∅ to V .
Progressions correspond to paths in the configuration space X in a natural way.
We call this progression the trace of $\omega$ on $V$. Conversely, given a progression $A_0, A_1, \ldots, A_m$, we can construct a path $\omega$ in the following fashion (see Fig. 8). The path $\omega$ consists of segments corresponding to transitions $A_{i-1} \to A_i$ for $i = 1, 2, \ldots, m$. At the beginning of the segment corresponding to $A_{i-1} \to A_i$, the path is at the configuration with particles on $A_{i-1}$ and $U \setminus N(A_{i-1})$. If $A_{i-1} \subsetneq A_i$, the path then proceeds by removing particles one by one from the neighbours of the unique site $a_i \in A_i \setminus A_{i-1}$ and then placing a particle at $a_i$. If $A_{i-1} \supsetneq A_i$, the path $\omega$ does the reverse: it first removes the particle that is on the unique site $a_i \in A_{i-1} \setminus A_i$ and then places particles on the neighbours of $a_i$, one after another. Observe that the trace of the path $\omega$ thus obtained is precisely the progression $A_0, A_1, \ldots, A_m$. In particular, there are indices $0 = k_0 < k_1 < \cdots < k_m = n$ such that $\omega_V(k_i) = A_i$ and $\omega_U(k_i) = U \setminus N(A_i)$. We call the sequence $\omega(k_0), \omega(k_1), \ldots, \omega(k_m)$ the backbone of $\omega$.

The path associated to a progression is locally optimal, in the sense that the critical [...] When the progression is isoperimetric, the critical resistance of the associated path has a sharp upper bound in terms of the isoperimetric function $\Delta(s)$.
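The construction of the path associated with a nested progression can be sketched in a few lines. This is an illustrative sketch restricted to the nested case (only moves of type $-U$ and $+V$); the function name, the dictionary `nbrs` mapping each site of $V$ to its neighbours in $U$, and the representation of a configuration as a pair of frozensets are all assumptions made for the example.

```python
def path_from_nested_progression(U, nbrs, numbering):
    """Build the path from u = (U, empty) that follows the nested progression
    A_k = {a_1, ..., a_k} given by `numbering` (distinct sites of V).
    Returns the list of configurations (x_U, x_V) and the backbone indices."""
    x_U, x_V = set(U), set()
    path = [(frozenset(x_U), frozenset(x_V))]
    backbone = [0]
    for a in numbering:
        # Remove, one by one, the particles still sitting on neighbours of a.
        for b in sorted(nbrs[a] & x_U):
            x_U.remove(b)
            path.append((frozenset(x_U), frozenset(x_V)))
        # Then place a particle at a.
        x_V.add(a)
        path.append((frozenset(x_U), frozenset(x_V)))
        backbone.append(len(path) - 1)
    return path, backbone

if __name__ == "__main__":
    # Even cycle Z_6: U = even sites, V = odd sites.
    U = {0, 2, 4}
    nbrs = {v: {(v - 1) % 6, (v + 1) % 6} for v in (1, 3, 5)}
    path, backbone = path_from_nested_progression(U, nbrs, [1, 3, 5])
    for step, (x_U, x_V) in enumerate(path):
        print(step, sorted(x_U), sorted(x_V), "backbone" if step in backbone else "")
```

At every backbone index $k_i$ the configuration has particles exactly on $A_i$ and $U \setminus N(A_i)$, in line with the description above.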

Lemma 3.1 (Critical resistance of an isoperimetric progression). Let $A_0, A_1, \ldots, A_m$ be an isoperimetric progression, and set $s_{\min}$ [...] where $s^\dagger$ is a maximiser of the function $g(s) := \Delta(s) - \alpha(s-1)$ over the set $\{s_{\min}+1, s_{\min}+2, \ldots, s_{\max}\}$. Furthermore, equality holds provided the progression is nested and [...]

See Appendix C.9 for the proof.
We say that a path $\omega = \omega(0) \to \omega(1) \to \cdots \to \omega(n)$ is monotone when $\omega(i) \preceq \omega(i+1)$ for each $i$, in other words, when $\omega$ consists only of transitions of the type $-U$ (i.e., removal of a particle from $U$) and $+V$ (i.e., addition of a particle to $V$). Observe that the trace of a monotone path is a nested progression. Conversely, the path associated to a nested progression is monotone. We call the path associated to a nested isoperimetric progression a standard path. Clearly, the configurations in the backbone of a standard path are isoperimetrically optimal. Moreover, every configuration $x$ on a standard path that is not part of the backbone satisfies $\Delta(s) \le \Delta(x) \le \Delta(s+1) + 1$, where $s := |x_V|$. An argument for the following lemma can be found in Appendix C.9.

Lemma 3.2 (Optimality of standard paths). Every standard path is optimal.
Assuming the existence of sufficiently long isoperimetric numberings, Lemmas 3.1 and 3.2 can be combined to identify the critical resistance between u and J(u).

Proposition 3.3 (Identification of the critical resistance).
Let $\tilde{s} > 0$ be an integer such that $\Delta(\tilde{s}) \le \alpha \tilde{s}$, and let $s^*$ be a maximiser of the function $g(s) := \Delta(s) - \alpha(s-1)$ over the set $\{1, \ldots, \tilde{s}\}$. Suppose that an isoperimetric numbering of at least $\tilde{s}$ vertices in $V$ exists. Then the critical resistance between $u$ and $J(u)$ is given by [...]

Absence of traps
In this section we provide a general condition for the absence of traps (i.e., $\pi(x)\Psi(x, J^-(x)) \prec \pi(u)\Psi(u, J(u))$ for every $x \in \mathcal{X} \setminus \{u, v\}$). The argument provided in Appendix C.10 is an adaptation of the one for Glauber dynamics of the Ising model (see Bovier and den Hollander [14, Section 17.3.1]), and crucially relies on the presence of a partial ordering on the configuration space with respect to which the stationary distribution satisfies the FKG condition (3.3). Although the following proposition does not cover all possible cases, it is easy to state and its assumption is easy to check.

Proposition 3.4 (Absence of traps).
Assume that $|U| < (1+\alpha)|V|$. Suppose further that, for every $j \in V$, there is a standard path $\omega : u \rightsquigarrow J(u)$ such that the first particle that $\omega$ places on $V$ is at $j$. Then every configuration [...]

The hypothesis of Proposition 3.4 can be rewritten in terms of isoperimetric numberings, hence providing an isoperimetric criterion for the absence of traps.

Critical gate and progressions
Once we establish the absence of traps, we can use Corollary B.4 to write the mean crossover time E u [T v ] in terms of the effective resistance R(u ↔ J(u)). As we saw in Proposition B.12, a sharp estimate for the effective resistance R(u ↔ J(u)) can be obtained if we are able to identify the critical gate between u and J(u).
The purpose of hypothesis (H4) in Section 1.3 was to describe the critical gate between u and J(u) in terms of the isoperimetric properties of the underlying graph.
The following proposition clarifies this connection and is verified in Appendix C.12.

Optimal paths close to the bottleneck
In order to identify the critical gate between $u$ and $J(u)$, we need an understanding of the optimal paths from $u$ to $J(u)$ at and around the bottleneck. In this section, we demonstrate that the configurations close to the bottleneck in every such optimal path have to be almost isoperimetrically optimal. We state the lemmas in a general setting, but the reader should keep the even torus (Example 2.7) in mind as a guiding example.
We assume that there is a standard path between $u$ and $J(u)$, and we let $s^*$ be as in [...] for $s \in \mathbb{N}$. We verify that, near the bottleneck, every basic step of an optimal path is through an isoperimetrically optimal configuration.
The following three lemmas establish the isoperimetric optimality of basic configurations in an optimal path $u \rightsquigarrow J(u)$ when it passes the bottleneck. The proofs can be found in Appendix C.11.

Lemma 3.7 (Optimality close to the bottleneck).
Let ω : u ⇝ J(u) be an arbitrary optimal path, and let x be a basic configuration in ω with s particles on V. Suppose that d∆(s) + ε ≥ α(ds + 1) for some ε ≥ 0. Then x is isoperimetrically ε-optimal. In particular, x is optimal when s < s* + 1/α − 1 and ∆(s) ≥ ∆(s*).

Lemma 3.8 (Optimality close to the bottleneck).
Let ω : u ⇝ J(u) be an arbitrary optimal path, and let x be the first configuration in ω that has s + 1 particles on V. Suppose that ∆(s + 1) ≥ ∆(s) and d∆(s) + ε ≥ α(ds + 1) for some ε ≥ 0. Then x is isoperimetrically ε-optimal. In particular, x is optimal when s < s* + 1/α − 1 and [...]

Let t* := |U| − s* − ∆(s*) denote the number of particles on U in an isoperimetrically optimal configuration that has s* particles on V.

Lemma 3.9 (Optimality close to the bottleneck).
Let $\omega : u \rightsquigarrow J(u)$ be an arbitrary optimal path and assume that $s^* \ge 2$ and $\Delta(s^*) = \Delta(s^*-1) + \delta$ for some $\delta \ge 0$. Let $\omega(q)$ be a basic configuration in $\omega$ with at least $s^*$ particles on $V$. Let $\omega(p)$ (with $p < q$) be the last basic configuration before $\omega(q)$ with fewer than $s^* - 1$ particles on $V$. Then the next basic configuration after $\omega(p)$ has $s^* - 1$ particles on $V$ and at least $t^* + 2$ particles on $U$. In particular, it is isoperimetrically $(\delta - 1)$-optimal.
The next proposition combines the above three lemmas to describe an isoperimetric constraint on the optimal paths u J(u), which in some cases will help us identify the critical gate. See Fig. 9 for an illustration. There are no basic configurations beyond the dashed line.

Proposition 3.10 (Constraint on optimal paths). Assume that hypotheses (H1) and (H3) are satisfied. Let $\kappa$ be an integer satisfying $0 \le \kappa < 1/\alpha$. (For instance, we can take [...]

See Appendix C.11 for the proof. As we saw in Proposition 3.6, finding two families $\mathcal{A}, \mathcal{B}$ satisfying hypothesis (H4) of Section 1.3 allows us to identify the critical gate between $u$ and $J(u)$. With the help of Proposition 3.10, we can replace hypothesis (H4) with hypotheses (H5) and (H6) and prove Proposition 1.4. See Appendix C.12 for the proof of Proposition 1.4.

Proof of the three metastability theorems

[...] The assumption of absence of traps used in Corollary B.4 follows from Corollary 3.5 and hypotheses (H0) and (H2) [...] The claim follows.

[...] (H0) and (H2). To see that the other assumption $\pi(u)\Psi(u, J(u))$ [...] $1$ holds, recall that the underlying graph is assumed to be connected. Therefore, the first move of every path $\omega : u \rightsquigarrow J(u)$ is of the type $-U$ (i.e., removing a particle from $U$) and [...]

Critical gate
Proof of Theorem 1.3.
(i) As in the proof of Theorem [...] To estimate $R(u \leftrightarrow J(u))$, we identify a critical gate between $\{u\}$ and $J(u)$ and apply Proposition B.12. Since conditions (H1), (H3) and (H4) are satisfied, Proposition 3.6 implies that the sets $Q$ and $Q^*$ form a critical pair between $\{u\}$ and $J(u)$. Therefore [...] where $c(Q, Q^*) := \sum_{x \in Q} \sum_{y \in Q^*,\, x \sim y} c(x, y)$. On the other hand, whenever $x \in Q$ and $y \in Q^*$ and $x \sim y$, the configuration $y$ is obtained from $x$ by removing a particle from $U$, and furthermore, $y_V = A$ and $y_U = U \setminus N(B)$ for some $A \in \mathcal{A}$ and $B \in \mathcal{B}$ with $|B \setminus A| = 1$.
(ii) We apply Proposition B.13 with $a := u$ and $B := \{v\}$. From Proposition 3.6 and using (H1), (H3) and (H4), we know that $(Q, Q^*)$ is a critical pair between $\{u\}$ and $J(u)$. Corollary 3.5 and hypotheses (H0) and (H2) imply the absence of traps. Observe that in the absence of traps, a critical pair between $\{u\}$ and $J(u)$ is also a critical pair between $\{u\}$ and $\{v\}$. The result now follows after we observe from (4.6) that for all pairs $x \in Q$ and $y \in Q^*$ with $x \sim y$, the conductance $c(x, y)$ has the same value.

Sophisticated examples: the isoperimetric problem
The bipartite isoperimetric problem introduced in Section 2.3 belongs to a general class of combinatorial isoperimetric problems. An isoperimetric problem on a graph asks for a set of vertices with a given cardinality that has the smallest boundary. Depending on how we measure the size of the boundary of a set (called the isoperimetric cost ), we get various versions of the isoperimetric problem. In this section, we study the bipartite isoperimetric problem for the examples of graphs considered in Section 2.3 by reducing the problem to classical isoperimetric problems for which more information is available. In Section 5.1, we derive the solutions of the bipartite isoperimetric problem on the torus by reducing it to the edge isoperimetric problem. In Section 5.2, we study cases in which the bipartite isoperimetric problem can be reduced to the vertex isoperimetric problem.

Even torus
The aim of this section is to derive the solutions of the bipartite isoperimetric problem on an even torus, which are described in Example 2.7. For simplicity, we first consider the bipartite isoperimetric problem on the infinite lattice Z × Z. We follow the approach of den Hollander, Nardi and Troiani [34] to reduce the problem to the standard edge isoperimetric problem on the lattice. The edge isoperimetric problem on the two-dimensional square lattice was solved by Harary and Harborth [29], and later independently (and more completely) by Alonso and Cerf [2].
Let us start by recalling the edge isoperimetric problem on graphs. Consider a locally finite graph $G$. The edge boundary of a set $A \subseteq V(G)$, denoted by $\partial A$, is the set of edges between $A$ and its complement. The edge isoperimetric problem on $G$ is the isoperimetric problem in which $|\partial A|$ is counted as the isoperimetric cost of $A$. Now, let $G$ be bipartite with parts $U$ and $V$, and assume that $G$ is $r$-regular. For a finite set $A \subseteq V$, we get the identity [...] by counting the edges incident to $N(A)$ in two ways (recall that $N(A)$ is the set of sites in $U$ with a neighbour in $A \subseteq V$). As a result, we get the following convenient representation of the bipartite isoperimetric cost (see Fig. 10a). [...]

It can be verified by direct inspection that every non-empty set A ⊆ V that is optimal with respect to the edge boundary in L satisfies |N 1 (A)| − 2 |N 1010 (A)| − |N 3 (A)| = 4. We claim that the same equality holds when A is optimal with respect to the bipartite isoperimetric cost ∆. [...] A proof of the above lemma can be found in Appendix C.13.
In conclusion, we have the equality [...] for every non-empty $A \subseteq V$ that is optimal either with respect to the edge boundary in $L$ or with respect to the bipartite isoperimetric cost $\Delta$. It follows that the solutions of the bipartite isoperimetric problem on the lattice $\mathbb{Z} \times \mathbb{Z}$ coincide with the solutions of the edge isoperimetric problem on the lattice $L$. The edge boundary of an optimal set with $s$ vertices has size $2\lceil 2\sqrt{s}\, \rceil$, and the optimal sets in $L$ are those described in Example 2.7. Thus, $\Delta(s) = \lceil 2\sqrt{s}\, \rceil + 1$ for $s > 0$, and the optimal sets with respect to $\Delta$ are as described in Example 2.7.
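For very small $s$, the closed form $\Delta(s) = \lceil 2\sqrt{s}\, \rceil + 1$ can be compared with an exhaustive search. The sketch below rests on two assumptions that should be read as such: the bipartite cost of $A \subseteq V$ is taken to be $\Delta(A) = |N(A)| - |A|$, and the search is restricted to subsets of a $5 \times 5$ window of the lattice (with $V$ the sites of odd coordinate sum), which is large enough to contain an optimal set for $s \le 5$ but is not a proof of optimality. The function names are hypothetical.

```python
from itertools import combinations
from math import isqrt

def neighbours(site):
    i, j = site
    return {(i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)}

def cost(A):
    """Assumed bipartite isoperimetric cost |N(A)| - |A| for a non-empty A inside V."""
    N = set().union(*(neighbours(a) for a in A))
    return len(N) - len(A)

def brute_force_delta(s, box=5):
    """Minimum cost over s-subsets of the V-sites in a box x box window."""
    V_box = [(i, j) for i in range(box) for j in range(box) if (i + j) % 2 == 1]
    return min(cost(A) for A in combinations(V_box, s))

def closed_form_delta(s):
    """ceil(2*sqrt(s)) + 1 for s >= 1, computed in exact integer arithmetic."""
    return isqrt(4 * s - 1) + 2

if __name__ == "__main__":
    for s in range(1, 6):
        print(s, brute_force_delta(s), closed_form_delta(s))
```

The integer identity $\lceil 2\sqrt{s}\, \rceil = \lfloor \sqrt{4s-1} \rfloor + 1$ for $s \ge 1$ avoids floating-point rounding in the closed form.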
Finally, we argue that the solutions of the bipartite isoperimetric problem on an even torus $\mathbb{Z}_m \times \mathbb{Z}_n$ are the same (modulo translations) as the solutions for the infinite lattice $\mathbb{Z} \times \mathbb{Z}$ as long as the size of the set is small compared to $m$ and $n$. To see why, it is enough to note that if $A$ has fewer than $\frac{1}{4}\min\{m, n\}$ vertices, then it cannot "sense" the distinction between $\mathbb{Z}_m \times \mathbb{Z}_n$ and $\mathbb{Z} \times \mathbb{Z}$. More precisely, let $A$ be an optimal set in $\mathbb{Z}_m \times \mathbb{Z}_n$ with $|A| < \frac{1}{4}\min\{m, n\}$. Then, the pre-image of $A$ under the canonical projection from $\mathbb{Z} \times \mathbb{Z}$ to $\mathbb{Z}_m \times \mathbb{Z}_n$ can be partitioned into countably many sets $A_i$ (for $i \in \mathbb{Z} \times \mathbb{Z}$) such that each $A_i$ is a translated copy of $A$ and the sets $A_i \cup N(A_i)$ are disjoint. In particular, $|A_i| = |A|$ and $\Delta(A_i) = \Delta(A)$. Conversely, if $A'$ is an optimal set in $\mathbb{Z} \times \mathbb{Z}$ with $|A'| < \frac{1}{4}\min\{m, n\}$, then $A' \cup N(A')$ is connected (see the proof of Lemma 5.2). It is easy to see that the canonical projection of $\mathbb{Z} \times \mathbb{Z}$ onto $\mathbb{Z}_m \times \mathbb{Z}_n$ maps every connected set with fewer than $\min\{m, n\}$ elements injectively. In particular, if $A$ denotes the projection of an optimal set $A'$, then $A$ is a translated copy of $A'$ and we have $|A| = |A'|$ and $\Delta(A) = \Delta(A')$.

Reduction to vertex isoperimetry
In the vertex isoperimetric problem, the size of the boundary of a set A is measured as |N (A) \ A|. The bipartite isoperimetric problem on a doubled graph G [2] is equivalent to the vertex isoperimetric problem on the original graph G. [...] The doubled version of a non-bipartite graph is similar, except that it has a "Möbius twist" along each odd cycle (see Fig. 2 and Fig. 4).

Doubled torus
According to Observation 5.3, the bipartite isoperimetric problem on a doubled torus is equivalent to the vertex isoperimetric problem on a torus. Since we will be concerned only with sets that are small in comparison with the dimensions of the torus, we may consider the infinite lattice Z × Z instead. As mentioned in Example 2.8, Wang and Wang [47] have produced an isoperimetric numbering for the vertex isoperimetric problem on Z × Z. Vainsencher and Bruckstein [46] have provided a characterisation of the optimal sets of certain critical cardinalities. A complete characterisation of the optimal sets for the remaining cardinalities is beyond the scope of this paper. In this section, we propose a conjecture that, if true, will allow us to obtain sharp asymptotics for the metastable transition in the Widom-Rowlinson model on a torus.
Every positive integer $s$ has a unique representation $s = \ell^2 + (\ell-1)^2 + r$ where $\ell > 0$ and $0 \le r < 4\ell$. Note that $4\ell = (\ell-1) + \ell + \ell + (\ell+1)$. We call a number $s = \ell^2 + (\ell-1)^2 + r$ critical if $r \in \{0, \ell-1, 2\ell-1, 3\ell-1\}$. Observe from (2.23) that $\Delta(s)$ is non-decreasing, with $\Delta(s+1) > \Delta(s)$ if and only if $s$ is a critical cardinality. It follows that an optimal set $A$ has a critical cardinality if and only if it is also co-optimal, meaning that it has maximum cardinality among all sets $B$ with $\Delta(B) = \Delta(A)$. A set that is both optimal and co-optimal is called Pareto optimal.
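The representation $s = \ell^2 + (\ell-1)^2 + r$ and the list of critical cardinalities are straightforward to compute. The following sketch uses hypothetical function names `decompose` and `is_critical` and a simple linear search for $\ell$, which is adequate for illustration.

```python
def decompose(s):
    """Return (l, r) with s = l^2 + (l-1)^2 + r, l > 0 and 0 <= r < 4l."""
    if s < 1:
        raise ValueError("s must be a positive integer")
    l = 1
    # The block of cardinalities belonging to l+1 starts at (l+1)^2 + l^2.
    while (l + 1) ** 2 + l ** 2 <= s:
        l += 1
    return l, s - l * l - (l - 1) ** 2

def is_critical(s):
    """True when r belongs to {0, l-1, 2l-1, 3l-1}."""
    l, r = decompose(s)
    return r in {0, l - 1, 2 * l - 1, 3 * l - 1}

if __name__ == "__main__":
    for s in range(1, 26):
        l, r = decompose(s)
        print(s, l, r, "critical" if is_critical(s) else "")
```

Uniqueness follows because consecutive blocks have length $(\ell+1)^2 + \ell^2 - \ell^2 - (\ell-1)^2 = 4\ell$, so the ranges $0 \le r < 4\ell$ tile the positive integers without overlap.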
For A ⊆ Z × Z and k ≥ 0, let N k (A) denote the set of sites within graph distance k from A, i.e., the ball of radius k around A. Vainsencher and Bruckstein [46] have shown that a non-empty set is Pareto optimal if and only if it has the form N k (S) for k ≥ 0 and a set S that is obtained by translation and rotation from one of the basic forms in Fig. 11a. We call the set S the seed of N k (S).

Figure 11: Every Pareto optimal set (i.e., an optimal set with a critical cardinality) on the lattice is generated by a seed. (a) The seeds generating the Pareto optimal sets (up to rotations and translations): types I, II, IIIa, IIIb and IV. (b) Examples of sets generated from the seeds and their cardinalities.
Pareto optimal sets of consecutive types can be connected via nested isoperimetric progressions.

Observation 5.5 (Existence of connecting progressions).
(a) Let $S$ and $S'$ be seeds of type I and II of Fig. 11a, respectively, and suppose that $N(S) \subseteq S'$. Then, for every $\ell \ge 2$, there is a nested isoperimetric progression from $N_{\ell-1}(S)$ to $N_{\ell-2}(S')$.
(b) Let $S$ and $S'$ be seeds of type II and III of Fig. 11a, respectively, and suppose that $S \subseteq N(S')$. Then, for every $\ell \ge 2$, there is a nested isoperimetric progression from $N_{\ell-2}(S)$ to $N_{\ell-1}(S')$.
(c) Let $S$ and $S'$ be seeds of type III and IV of Fig. 11a, respectively, and suppose that $S \subseteq S'$. Then, for every $\ell \ge 1$, there is a nested isoperimetric progression from $N_{\ell-1}(S)$ to $N_{\ell-1}(S')$.
(d) Let $S$ and $S'$ be seeds of type IV and I of Fig. 11a, respectively, and suppose that $S \subseteq N(S')$. Then, for every $\ell \ge 1$, there is a nested isoperimetric progression from $N_{\ell-1}(S)$ to $N_{\ell}(S')$.
As an immediate consequence, we find that Pareto optimal sets are achieved via isoperimetric numberings.
In order to identify the critical gate for the Widom-Rowlinson model on a torus, we will also need some information about all isoperimetric progressions connecting Pareto optimal sets of consecutive types. This requires a better understanding of the optimal sets with non-critical cardinalities, which we do not have. Nonetheless, we make the following conjecture.

(a) Let $B_0, B_1, \ldots, B_n$ be an isoperimetric progression with $|B_0| = \ell^2 + (\ell-1)^2 + \ell - 1$ and $|B_n| = \ell^2 + (\ell-1)^2 + 2\ell - 1$ and $|B_0| < |B_i| < |B_n|$ for $0 < i < n$. Let $S_0$ be the seed of $B_0$ and $S_n$ the seed of $B_n$, so that $B_0 = N_{\ell-2}(S_0)$ and $B_n = N_{\ell-1}(S_n)$. Then, $S_0 \subseteq N(S_n)$ and $B_0 \subseteq B_1 \subseteq B_n$.

Hypercube
According to Observations 5.3 [...] where $\psi_d(r, k)$ satisfies the recursion [...] The proof can be found in Appendix C.13.

Sophisticated examples: key results
After having collected in Section 3 the relevant tools, we are now ready to apply our results to the 'sophisticated examples' in Section 2.3: torus, doubled torus, regular tree-like graph, hypercube. In the case of the torus, where a complete solution of the isoperimetric problem is known, we obtain a complete picture of the metastable transition from u to v. In the case of the doubled torus, the complete picture relies on the validity of Conjecture 5.7. In other cases we obtain an incomplete picture that is still informative.

Hard-core on an even torus
In this section, we combine our results to give a description of the metastable transition of the hard-core dynamics on an even torus $\mathbb{Z}_m \times \mathbb{Z}_n$. We assume $0 < \alpha < 1$, $2/\alpha \notin \mathbb{Z}$ and $m, n \gg 1/\alpha$. [...]

A more accurate estimate on the mean crossover time as well as a description of the critical droplet is provided by Theorem 1.3, which relies on hypotheses (H3) and (H4). Hypothesis (H3) is already verified in Lemma 6.1. [...]

In order to identify the family $\mathcal{B}$, recall from Example 2.7 that each isoperimetrically optimal set $B$ with $|B| = s^* = (\ell^*-1)\ell^* + 1$ consists of an element $A \in \mathcal{A}$ (i.e., an $(\ell^*-1) \times \ell^*$ tilted rectangle) and an extra site $b$ along one of the four sides of the rectangle (see Fig. 5c). Observe that if $b$ is along a longer side of $A$, then $B$ can be extended via a nested isoperimetric progression to an element of $\mathcal{C}$ (i.e., an $\ell^* \times \ell^*$ tilted rectangle), whereas if $b$ is along a shorter side of $A$, then every isoperimetric progression from $B$ to $\mathcal{C}$ must pass through $A$. Therefore,
• $\mathcal{B}$ consists precisely of tilted $(\ell^*-1) \times \ell^*$ rectangles plus an extra element along one of the two longer sides of the rectangle.
A typical transition through the critical gate $[Q, Q^*]$ is depicted in Figure 12.

Figure 12: A typical transition through the critical gate for the torus. The critical length $\ell^*$ is assumed to be 6. The 'hole' is along one of the two long edges of the rectangle. Once a two-site 'hole' is produced, with probability close to 1 a (blue) particle appears very quickly in the opened-up space.
for the expected crossover time.

Widom-Rowlinson on a torus
As observed in Section 1.2, the Widom-Rowlinson dynamics on the torus $\mathbb{Z}_m \times \mathbb{Z}_n$ is equivalent to the hard-core dynamics on the doubled torus. We assume that $0 < \alpha < 1$, $4/\alpha \notin \mathbb{Z}$ and $m, n \gg 1/\alpha$. The isoperimetric function $\Delta(s)$ on the doubled torus is provided in Example 2.8, using the equivalence of the bipartite isoperimetric problem on a doubled torus and the vertex isoperimetric problem on the torus, together with the known result about the vertex isoperimetric problem on the torus. We shall obtain the exponentiality of the distribution of the crossover time and the order of magnitude of its expected value. A sharp asymptotic for the expected crossover time and a description of the critical droplet are obtained under the assumption that Conjecture 5.7, regarding the solutions of the vertex isoperimetric problem on $\mathbb{Z} \times \mathbb{Z}$, is true.
The proof of the following lemma appears in Appendix C.14.
As in the previous section, finding the exact value of the resettling size (i.e., the smallest $s$ for which $\Delta(s) \le \alpha s$) is not necessary. It suffices to observe that the resettling size exists (the inequality is achieved for instance for $s > (2/\alpha + 1)^2 + (2/\alpha)^2 - 1$) and is independent of $m$ and $n$ (as long as $m, n \gg 1/\alpha$).
(6.5) for its mean, where $\ell^* \triangleq [1/\alpha]$ is the closest integer to $1/\alpha$.
• $\mathcal{C}$ consists precisely of sets $N^{\ell^*-1}(S')$ where $S'$ is a seed of type III.
Conditions (H6.a) and (H6.b) follow from Observation 5.6. Assuming Conjecture 5.7 is true, and using Observation 5.5, we obtain a characterisation of $\mathcal{B}$.
• $\mathcal{B}$ consists precisely of the sets $B$ with $|B| = s^*$ such that $N^{\ell^*-2}(S) \subseteq B \subseteq N^{\ell^*-1}(S')$ for some seeds $S$ and $S'$ of type II and III, respectively, where $S \subseteq N(S')$.
for the expected crossover time.
for the expected crossover time.

Graph girth and crossover time
In Example 2.9, we noted that the optimal isoperimetric cost in a regular bipartite graph with large girth grows linearly for small cardinalities. Likewise, the optimal isoperimetric cost in a doubled version of a bipartite graph with large girth is linear when restricted to small cardinalities. Since $g(s) = \Delta(s) - \alpha(s - 1)$ has no critical point when $\Delta(s)$ is linear, we obtain lower bounds for the order of magnitude of the crossover time of the hard-core dynamics and Widom-Rowlinson dynamics on a (bipartite) regular graph in terms of the girth of the graph.

Figure (caption partially lost): The critical length $\ell^*$ in both cases is assumed to be 4. The 'hole' can be anywhere in the highlighted region. Once a two-site 'hole' is produced, with probability close to 1 a blue particle appears very quickly in the opened-up space.
First, let us consider a $d$-regular bipartite graph in which the length of each cycle is at least $\ell$. We know from Example 2.9 that $\Delta(s) = (d - 2)s + 1$ for $0 < s < \ell/2$. Therefore, $g(s) = \Delta(s) - \alpha(s - 1) = (d - 2 - \alpha)s + 1 + \alpha$. If $d = 2$ (i.e., if the graph is a cycle), the critical size and the resettling size are both equal to 1. The hypotheses (H0)-(H4) are trivially satisfied with $\mathcal{A} = \{\emptyset\}$ and $\mathcal{B} = \{\{b\} : b \in V\}$ and $|[Q, Q^*]| = 2|V|$. Therefore, in this case, we recover the result of Example 2.2. If, on the other hand, $d > 2$, the function $g(s)$ is increasing for $0 < s < \ell/2$ and can achieve its maximum only at $s \ge \ell/2$. While Theorem 1.1 is not applicable (condition (H1) may not be satisfied), a direct application of Proposition B.2 and Lemmas 3.1-3.2 leads to the following lower bound for the expected crossover time.
where $s \triangleq \ell/2$. Therefore, in this case, we recover the result of Example 2.6 even when the cycle is not even. If, on the other hand, $d > 2$, the function $g(s)$ is increasing for $0 < s < \ell - 1$ and can achieve its maximum only at $s \ge \ell - 1$. Therefore, we get a similar lower bound for the expected crossover time using Proposition B.2.
EJP 23 (2018), paper 97.

Proposition 6.4 (Lower bound for expected crossover time: Widom-Rowlinson).
Let $G$ be a $d$-regular graph with $d > 2$ in which the length of each cycle is at least $\ell$, and $G^{[2]} = (U, V, E)$ its doubled version. Then, the crossover time from $u$ to $v$ on $G^{[2]}$ satisfies where $s \triangleq \ell - 1$.
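The monotonicity of $g$ in the linear regime can be checked numerically. The following is a minimal sketch, assuming the formula $\Delta(s) = (d-2)s + 1$ quoted above from Example 2.9 and assuming that the girth is large enough for every tabulated $s$ to lie in the linear regime; the function names `delta_small` and `g` are ours and purely illustrative.

```python
from fractions import Fraction

def delta_small(s, d):
    """Isoperimetric cost (d - 2) s + 1, as quoted from Example 2.9.

    Only meaningful for small s (below the girth-dependent threshold)."""
    return (d - 2) * s + 1

def g(s, d, alpha):
    """g(s) = Delta(s) - alpha (s - 1) in the linear regime."""
    return delta_small(s, d) - alpha * (s - 1)

alpha = Fraction(1, 2)  # any value with 0 < alpha < 1

# d > 2: slope d - 2 - alpha is positive, so g increases with s.
values_d3 = [g(s, 3, alpha) for s in range(1, 6)]

# d = 2 (a cycle): slope -alpha is negative, so g is maximal at s = 1.
values_d2 = [g(s, 2, alpha) for s in range(1, 6)]
```

For $d > 2$ the tabulated values increase strictly, which is why the maximum of $g$ cannot be attained inside the linear regime; for $d = 2$ the maximum sits at $s = 1$, in line with the critical size stated above.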

Hard-core and Widom-Rowlinson on a hypercube
As we saw in Example 2.10, the doubled version of a d-dimensional hypercube H d is isomorphic to a (d + 1)-dimensional hypercube H d+1 , hence the Widom-Rowlinson dynamics on H d is essentially the same as the hard-core dynamics on H d+1 . As before, we assume that 0 < α < 1.  If we further assume that α is irrational, then hypothesis (H3) will also be satisfied.

A Reversible Markov chains
A useful tool for studying reversible Markov chains is their analogy with electric networks and potential theory. This analogy has been exploited in various contexts, most notably for the recurrence/transience problem. The use of potential theory in the study of metastability was pioneered by Bovier, Eckhoff, Gayrard and Klein [12] and is developed in detail in the monograph by Bovier and den Hollander [14]. We start by recalling the relevant aspects of the connection between electric networks and reversible Markov chains, while fixing our notation and terminology (see Section A.1). Estimating the expected hitting time of a target set reduces via the above analogy to estimating the effective resistance between the starting point and the target as well as the voltage at different points of the network. Sharp estimates for effective resistance can be obtained using the machinery of the Nash-Williams inequalities (see Section A.2) or using the variational principles of Thomson and Dirichlet. A simpler estimate for effective resistance, capturing its order of magnitude, is given by "critical resistance", which is an abstract variant of the more standard notion of "communication height" often used in metastability theory (see Section A.3). Critical resistance can also be used to provide rough bounds for voltage (see Section A.4).

A.1 Connection with electric networks
In this section we fix the general notation and terminology and recall a few relevant facts about reversible Markov chains and their analogy with electric networks. The proofs and the background can be found in various sources, e.g. Doyle and Snell [24], Levin, Peres and Wilmer [40], Grimmett [28], Lyons and Peres [41], Aldous and Fill [1], Bovier and den Hollander [14].
We let $(X(n))_{n \in \mathbb{N}}$ be a discrete-time Markov chain with finite state space $\mathcal{X}$ and transition matrix $K : \mathcal{X} \times \mathcal{X} \to [0, 1]$. We assume that $K$ is irreducible and has a reversible stationary distribution $\pi$. We write $P_x$ and $E_x$ to denote probability and expectation conditioned on the event $X(0) = x$. The first hitting time of a set $A \subseteq \mathcal{X}$ is denoted by $T_A \triangleq \inf\{n \ge 0 : X(n) \in A\}$. When we disregard the case $X(0) \in A$, we write $T_A^+ \triangleq \inf\{n > 0 : X(n) \in A\}$. The first passage time through a transition $x \to y$ is likewise denoted by $T_{xy} \triangleq \inf\{n > 0 : X(n-1) = x \text{ and } X(n) = y\}$.
An analogy is made between the above reversible Markov chain and an electric network with nodes labelled by the elements of $\mathcal{X}$ in which node $x$ is connected to node $y$ by a resistor with conductance $c(x, y) \triangleq \pi(x)K(x, y) = \pi(y)K(y, x)$ (and resistance $r(x, y) = 1/c(x, y) \in (0, \infty]$). We write $x \sim y$ when $c(x, y) > 0$. The first basic connection between the two objects is that the function $h(x) \triangleq P_x(T_A < T_B)$ is the unique harmonic function with boundary conditions $h|_A \equiv 1$ and $h|_B \equiv 0$. Therefore $P_x(T_A < T_B)$ coincides with the voltage $W_{A,B}(x)$ at node $x$ if all the nodes in $B$ are connected to the ground and all the nodes in $A$ are connected to a unit voltage source.
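The identification of hitting probabilities with voltages can be tried out on a small network. The sketch below is illustrative only: the helper `voltage` and its input format (a dictionary of edge conductances) are our own choices, and the linear system for the harmonic function is solved exactly by Gauss-Jordan elimination over the rationals. It assumes that every node is connected to $A \cup B$, so that the system is non-singular.

```python
from fractions import Fraction

def voltage(conductance, A, B):
    """Harmonic function h with h = 1 on A and h = 0 on B.

    conductance: dict {(x, y): c} with one entry per undirected edge.
    For the chain K(x, y) = c(x, y) / sum_z c(x, z), h(x) = P_x(T_A < T_B)."""
    nodes = sorted({x for edge in conductance for x in edge})
    nbrs = {x: {} for x in nodes}
    for (x, y), c in conductance.items():
        nbrs[x][y] = Fraction(c)
        nbrs[y][x] = Fraction(c)
    free = [x for x in nodes if x not in A and x not in B]
    idx = {x: i for i, x in enumerate(free)}
    n = len(free)
    # Row for x encodes sum_y c(x, y) (h(x) - h(y)) = 0; last column is the right-hand side.
    rows = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for x in free:
        i = idx[x]
        for y, c in nbrs[x].items():
            rows[i][i] += c
            if y in idx:
                rows[i][idx[y]] -= c
            elif y in A:
                rows[i][n] += c  # boundary value 1 moves to the right-hand side
    # Gauss-Jordan elimination with exact arithmetic.
    for col in range(n):
        piv = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        pivot = rows[col][col]
        rows[col] = [v / pivot for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [vr - factor * vc for vr, vc in zip(rows[r], rows[col])]
    h = {x: Fraction(1) for x in A}
    h.update({x: Fraction(0) for x in B})
    h.update({x: rows[idx[x]][n] for x in free})
    return h

# Path 0 - 1 - 2 - 3 - 4 with unit conductances: simple random walk (gambler's ruin).
path = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 4): 1}
h = voltage(path, A={4}, B={0})
```

On the path with unit conductances, the computed voltage is linear, $h(x) = x/4$, which is the classical gambler's ruin probability of reaching 4 before 0.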
The effective resistance and effective conductance between two sets $A, B \subseteq \mathcal{X}$ (which are standard quantities in electric network theory and capture what happens when all the vertices in $A$, respectively, $B$ are wired) will be denoted by $R(A \leftrightarrow B)$ and $\mathcal{C}(A \leftrightarrow B)$, respectively. An easy consequence of the above connection is the equality $\mathcal{C}(a \leftrightarrow B) = \pi(a)\, P_a(T_B < T_a^+)$ for every state $a \in \mathcal{X}$ and set $B \subseteq \mathcal{X}$ not containing $a$.
When $T$ is a stopping time, we denote by $G_T(a, x)$ the expected number of visits to state $x$ if the chain is started at state $a$ and stopped at $T$, i.e., $G_T(a, x) \triangleq E_a\big[\sum_{0 \le n < T} 1_{\{X(n) = x\}}\big]$ (A.6). In case $x = a$, time 0 is also counted. The function $G_T$ is the Green function associated with $T$. The second basic connection between a reversible Markov chain and its corresponding electric network is an electric interpretation of the Green functions associated to hitting times. Namely, it can be shown, for a state $a \in \mathcal{X}$ and a set $B \subseteq \mathcal{X}$ not containing $a$, that the function $h(x) \triangleq G_{T_B}(a, x)/\pi(x)$ is harmonic with boundary conditions $h|_a \equiv R(a \leftrightarrow B)$ and $h|_B \equiv 0$. Therefore $G_{T_B}(a, x)/\pi(x)$ agrees with the voltage at $x$ provided all the nodes in $B$ are connected to the ground and $a$ is connected to a unit current source. It follows that $G_{T_B}(a, x) = R(a \leftrightarrow B)\, \pi(x)\, W_{a,B}(x)$ (A.7), where $W_{a,B}(x) = P_x(T_a < T_B)$ is the voltage at $x$ when $B$ is connected to the ground and $a$ is connected to a unit voltage source. As an immediate corollary, we get the useful identity $E_a[T_B] = R(a \leftrightarrow B) \sum_{x \in \mathcal{X}} \pi(x)\, W_{a,B}(x)$ for every state $a \in \mathcal{X}$ and set $B \subseteq \mathcal{X}$ not containing $a$.
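As a sanity check of the representation of the mean hitting time as effective resistance times a $\pi$-weighted sum of voltages, consider the reflecting simple random walk on $\{0, \ldots, n\}$ with $a = 0$ and $B = \{n\}$. This example is ours and is not taken from the paper; the sketch assumes the normalisation $c(x, y) = \pi(x)K(x, y)$ and uses the standard facts that resistances in series add and that the voltage on a homogeneous path is linear.

```python
from fractions import Fraction

def mean_hitting_time_via_network(n):
    """E_0[T_n] computed as R(0 <-> n) * sum_x pi(x) W(x) for the reflecting walk."""
    # Stationary distribution: proportional to the degree of each site.
    pi = [Fraction(1, 2 * n) if x in (0, n) else Fraction(1, n) for x in range(n + 1)]
    # K(x, x + 1): the walk is reflected at 0 and otherwise steps right with probability 1/2.
    k_right = [Fraction(1) if x == 0 else Fraction(1, 2) for x in range(n)]
    # Resistance of the edge {x, x + 1} is 1 / (pi(x) K(x, x + 1)).
    r = [1 / (pi[x] * k_right[x]) for x in range(n)]
    effective_resistance = sum(r)  # edges in series
    # Voltage W(x) = P_x(T_0 < T_n): linear, since all edge resistances coincide.
    w = [Fraction(n - x, n) for x in range(n + 1)]
    return effective_resistance * sum(p * v for p, v in zip(pi, w))

def mean_hitting_time_direct(n):
    """E_0[T_n] from the one-step recursion t_x = E_x[T_{x+1}] = t_{x-1} + 2, t_0 = 1."""
    t, total = Fraction(0), Fraction(0)
    for x in range(n):
        t = Fraction(1) if x == 0 else t + 2
        total += t
    return total
```

Both routes give $E_0[T_n] = n^2$: the effective resistance is $2n^2$ and the weighted voltage sum is $1/2$.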
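If $t$ is a non-negative constant, then by reversibility we have the general identity $\pi(x)G_t(x, y) = \pi(y)G_t(y, x)$.
This identity remains valid for Green functions associated with hitting times: $\pi(x)\,G_{T_Z}(x, y) = \pi(y)\,G_{T_Z}(y, x)$ (A.10) for every two states $x, y \in \mathcal{X}$ and every set $Z \subseteq \mathcal{X}$. A similar reciprocity law holds for hitting order probabilities: $R(y \leftrightarrow Z)\, P_x(T_y < T_Z) = R(x \leftrightarrow Z)\, P_y(T_x < T_Z)$ (A.11) for every two states $x, y \in \mathcal{X}$ and every set $Z \subseteq \mathcal{X}$.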
The notion of projection for electric networks is much more relaxed than the notion of projection for Markov chains. Namely, identifying two nodes with the same voltage (i.e., making a short circuit between them) we do not affect the voltage at other nodes.
As a corollary, we have that the effective resistance R(A ↔ B) between two disjoint sets A, B ⊆ X remains unchanged when we contract A into a single node a and B into a single node b. This simplifies some arguments.

A.2 Sharp bounds for effective resistance
The variational principles of Thomson and Dirichlet are the most common tools to obtain upper and lower bounds for effective resistance. An alternative combinatorial approach due to Nash-Williams often gives simple and useful estimates.
We consider a graph on the state set $\mathcal{X}$ whose edges are the pairs $(x, y)$ with $c(x, y) > 0$. Let $A, B \subseteq \mathcal{X}$ be disjoint. A cut separating $A$ from $B$ is a set $C \subseteq \mathcal{X}$ such that $A \subseteq C \subseteq B^c$. Given a cut $C$, we write $\partial C \triangleq \{(x, y) : x \in C,\ y \notin C \text{ and } c(x, y) > 0\}$ for the set of edges between $C$ and $C^c$. The simplest form of the Nash-Williams inequality is the intuitive inequality $\mathcal{C}(A \leftrightarrow B) \le \sum_{e \in \partial C} c(e)$ for every cut $C$ separating $A$ from $B$. A dual (and equally intuitive) inequality $\mathcal{C}(A \leftrightarrow B) \ge \big(\sum_{e \in \omega} r(e)\big)^{-1}$ (A.13) holds for every path $\omega$ from $A$ to $B$. These two inequalities are special cases of the more general Nash-Williams inequalities, but can also be derived from the Dirichlet and the Thomson variational principles. While the above upper bound for effective conductance is sufficient for our purpose, we need a more accurate lower bound. The following extended version of the (dual) Nash-Williams inequality due to Berman and Konsowa [6] provides a method to obtain sharp lower bounds.

Proposition A.1 (Extended dual Nash-Williams inequality). Let $A, B \subseteq \mathcal{X}$. Let $(\omega_i)_{i \in \mathbb{N}}$ be an arbitrary sequence of simple paths from $A$ to $B$, with the property that no two paths $\omega_i$ and $\omega_j$ pass through a common edge in opposite directions. For each edge $e$, let $n(e)$ denote the number of paths $\omega_k$ that pass through $e$. Then $\mathcal{C}(A \leftrightarrow B) \ge \sum_i \big(\sum_{e \in \omega_i} n(e)\, r(e)\big)^{-1}$. (A.14)
The proof is similar to the proof of the standard Nash-Williams inequality, but for completeness, we include it in Appendix C.1. We note that the latter inequality is sharp: by allowing repetitions in the sequence $(\omega_i)_{i \in \mathbb{N}}$ we get arbitrarily close lower bounds for the conductance $\mathcal{C}(A \leftrightarrow B)$.
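The lower bound of Proposition A.1 is easy to evaluate for a finite family of paths. The following sketch is illustrative: the function name and the data layout (paths as node sequences, resistances keyed by unordered pairs) are our own, and the bound is taken in the form "sum over paths of the reciprocal of the $n(e)$-weighted path resistance", which is our reading of (A.14).

```python
from collections import Counter
from fractions import Fraction

def extended_dual_nash_williams(paths, resistance):
    """Lower bound on the effective conductance from a finite family of simple paths.

    paths: list of node sequences from A to B.
    resistance: dict {frozenset({x, y}): r} for the undirected edges."""
    steps = [list(zip(p, p[1:])) for p in paths]
    # n[e] = number of paths traversing the directed edge e.
    n = Counter(e for path in steps for e in path)
    if any((y, x) in n for (x, y) in n):
        raise ValueError("two paths traverse a common edge in opposite directions")
    return sum(
        Fraction(1) / sum(n[e] * resistance[frozenset(e)] for e in path)
        for path in steps
    )

# Two parallel routes from a to b: a - m - b (resistances 1 and 1) and a - b (resistance 2).
resistance = {
    frozenset({"a", "m"}): 1,
    frozenset({"m", "b"}): 1,
    frozenset({"a", "b"}): 2,
}
bound_both = extended_dual_nash_williams([["a", "m", "b"], ["a", "b"]], resistance)
bound_single = extended_dual_nash_williams([["a", "m", "b"]], resistance)
```

In this example the exact effective conductance between a and b is $1/2 + 1/2 = 1$ by the series and parallel laws. The two edge-disjoint paths already attain it, whereas a single path only gives $1/2$.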

A.3 Rough estimates for effective resistance
The order of magnitude of effective resistance is captured by the notion of "critical resistance", which is much easier to evaluate. We define the critical resistance between two sets $A, B \subseteq \mathcal{X}$ as $\Psi(A, B) \triangleq \inf_{\omega : A \leadsto B}\ \sup_{e \in \omega} r(e)$ (A.15), where the infimum is taken over all paths (sequences of distinct states) connecting $A$ to $B$, and the supremum is over all edges (pairs of consecutive states) on the path. For a path $\omega$, we refer to $\Psi(\omega) \triangleq \sup_{e \in \omega} r(e)$ (A.16) as the critical resistance of $\omega$.
Critical resistance is closely related to the notion of communication height, which is often used in the study of metastability in Metropolis dynamics (see Olivieri and Vares [45], Bovier and den Hollander [14]). The two notions are connected via the (imprecise) correspondence Ψ(A, B) ≈ e βΦ(A,B) , where Φ(A, B) is the communication height between A and B and β is the inverse temperature. While somewhat less intuitive, the notion of critical resistance has two advantages. First, it is defined for individual Markov chains (rather than parametric families of Markov chains), and therefore can also be used in asymptotic regimes other than β → ∞, in particular, when there is no clear-cut notion of energy. Second, while the height of a path ω is often defined as the maximum energy of a state on ω, the maximisation in the critical resistance is taken over pairs of consecutive states on ω. As noted in Cirillo, Nardi and Sohier [20], this turns out to be the appropriate definition for general (non-Metropolis) Markov chains.
The effective resistance $(a, b) \mapsto R(a \leftrightarrow b)$ defines a metric on $\mathcal{X}$. The critical resistance, on the other hand, defines an ultra-metric on $\mathcal{X}$:
• $\Psi(x, y) \ge 0$ with equality if and only if $x = y$,
• $\Psi(x, y) = \Psi(y, x)$,
• $\Psi(x, z) \le \max\{\Psi(x, y), \Psi(y, z)\}$.
The following proposition shows that the two metrics $(a, b) \mapsto R(a \leftrightarrow b)$ and $(a, b) \mapsto \Psi(a, b)$ are equivalent up to constants depending only on the graph (and not on the resistances $r$). Its proof can be found in Appendix C.2.
To understand the geometry of $\Psi$, let us recall two basic facts. First, every triangle in a general ultra-metric space is isosceles, with two equal sides and a third side that is no larger than the other two (i.e., the three sides can be ordered as $a \le b = c$). Second, suppose that $T$ is a minimal spanning tree on $\mathcal{X}$ (where edge $e$ is weighted by its resistance $r(e)$). Then, the $\Psi$-distance between two points $a, b \in \mathcal{X}$ is simply the maximal resistance of the unique path between $a$ and $b$ on $T$. In other words, every path on $T$ is geodesic with respect to $\Psi$.
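Critical resistance is a minimax (bottleneck) path problem, so it can be computed by a variant of Dijkstra's algorithm in which the length of a path is replaced by the largest resistance along it. The sketch below is one such implementation under our own conventions (function name, edge dictionary); it is not the method used in the paper, which only needs the definition.

```python
import heapq

def critical_resistance(resistance, source, target):
    """Psi(source, target): minimum over paths of the maximal edge resistance.

    resistance: dict {(x, y): r} with one entry per undirected edge."""
    nbrs = {}
    for (x, y), r in resistance.items():
        nbrs.setdefault(x, []).append((y, r))
        nbrs.setdefault(y, []).append((x, r))
    best = {source: 0}          # smallest known bottleneck value per node
    heap = [(0, source)]
    while heap:
        height, x = heapq.heappop(heap)
        if x == target:
            return height
        if height > best.get(x, float("inf")):
            continue            # stale heap entry
        for y, r in nbrs.get(x, []):
            h = max(height, r)  # bottleneck of the path extended by the edge {x, y}
            if h < best.get(y, float("inf")):
                best[y] = h
                heapq.heappush(heap, (h, y))
    return float("inf")         # target not reachable

# Triangle with resistances r(a, b) = 5, r(a, c) = 1, r(c, b) = 2.
triangle = {("a", "b"): 5, ("a", "c"): 1, ("c", "b"): 2}
sides = sorted([
    critical_resistance(triangle, "a", "b"),
    critical_resistance(triangle, "a", "c"),
    critical_resistance(triangle, "c", "b"),
])
```

In the triangle, the direct edge between a and b has resistance 5, but the detour through c has bottleneck 2, so $\Psi(a, b) = 2$. The three $\Psi$-distances are 1, 2, 2, an isosceles triangle with the two larger sides equal, as the ultra-metric property requires.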

A.4 Rough estimates for voltage
In order to estimate the Green function via (A.7), we will also need rough estimates for the voltage. The following proposition corresponds to Bovier and den Hollander [14, Lemma 7.13(iii)]. Its proof can be found in Appendix C.3.
is the voltage at x when B is connected to the ground and A is connected to a unit voltage source.
Using the inequalities between effective resistance and critical resistance (Proposition A.2), we obtain the following proposition as a corollary of the above two estimates.

Proposition A.4 (A priori estimate).
There is a constant $k \ge 1$ such that, for every two disjoint sets $A, B \subseteq \mathcal{X}$ and every node $x \in \mathcal{X} \setminus (A \cup B)$, (A.19) The constant $k$ can be chosen to be $|\mathcal{X}|^4$.
The following is a generalisation of the latter proposition. It expresses the intuition that a small distance between two nodes implies a small difference between their voltages. Its proof can be found in Appendix C.3.

Proposition A.5 (A priori estimate).
There is a constant $k \ge 1$ such that, for every two disjoint sets $A, B \subseteq \mathcal{X}$ and every two nodes $x, y \in \mathcal{X}$, (A.20) The constant $k$ can be chosen to be $|\mathcal{X}|^4$.

B Metastability in reversible Markov chains
In this section we discuss the metastable behaviour of reversible Markov chains in a certain asymptotic regime. Our treatment is based on Bovier and den Hollander [14,Chapters 7,8 and 16], although our exposition is somewhat different. In Section 2 we will specialize to hard-core dynamics.
We consider a one-parameter family of discrete-time irreducible Markov chains $\{X_\lambda(t)\}_{t \in \mathbb{N}}$ on a finite state space $\mathcal{X}$ with transition matrix $K_\lambda$ and reversible stationary distribution $\pi_\lambda$. The parameter $\lambda$ is assumed to be a real number. For hard-core dynamics, $\lambda$ determines the activity parameter at each site. (For Glauber dynamics of the Ising model, $\lambda$ would be the inverse temperature.) For brevity, we drop the subscript $\lambda$ from $X_\lambda(t)$, $K_\lambda$ and $\pi_\lambda$. We focus on the asymptotic regime $\lambda \to \infty$, where metastable phenomena are more prominent. We will use the following notation for asymptotics: $f \prec g$ if $f = o(g)$, $f \preceq g$ if $f = O(g)$, and $f \asymp g$ if $f \preceq g$ and $g \preceq f$ (with $\succ$ and $\succeq$ defined symmetrically). For simplicity, we make a smoothness assumption. Namely, we assume that all the transition probabilities $K(x, y)$ for different pairs $(x, y)$ are asymptotically comparable, i.e., for every two pairs of states $(x, y)$ and $(x', y')$, either $K(x, y) \prec K(x', y')$ or $K(x, y) \succeq K(x', y')$ as $\lambda \to \infty$, and for every two states $x$ and $y$, either $\pi(x) \prec \pi(y)$ or $\pi(x) \succeq \pi(y)$ as $\lambda \to \infty$. These conditions are trivially satisfied for hard-core dynamics on a bipartite graph. For convenience, we also assume that the graph of probable transitions of $K$ remains unchanged for all sufficiently large $\lambda$.
In Section B.1 we characterise metastability in terms of recurrence of metastable states. In Section B.2 we link the mean metastable transition time to the effective resistance of an associated electric network. In Section B.3 we explain the ubiquity of the exponential limit law for the metastable transition time divided by its mean. In Section B.4 we look at tail probabilities of the metastable transition time. In Section B.5 we derive sharp asymptotics for the effective resistance. In Section B.6 we look at the passage through bottlenecks.

B.1 A characterisation of metastability
One way to formulate metastability (in the asymptotic regime λ → ∞) is in terms of the recurrence behaviour of individual states. A metastable state behaves as a recurrent state on short time scales and as a transient state on long time scales. Other manifestations of metastability include a short transition period on the critical time scale and approximate exponentiality of the distribution of the transition time.
More specifically, when $\tau = \tau(\lambda)$ is a non-negative real-valued function, we say that a state $a \in \mathcal{X}$ is transient at time scale $\tau$ (or $\tau$-transient, for short) when $G_\tau(a, a) \prec \tau$ as $\lambda \to \infty$ and recurrent at time scale $\tau$ (or $\tau$-recurrent) when $G_\tau(a, a) \asymp \tau$ as $\lambda \to \infty$. In intuitive terms, state $a$ is $\tau$-recurrent if the Markov chain starting from $a$ spends, on average, a non-negligible fraction of the time interval $[0, \tau)$ at $a$, and is $\tau$-transient otherwise.
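The dichotomy between recurrence and transience at a time scale can be seen numerically on a toy chain. The two-state chain below, with $K(a, b) = \varepsilon$ and $K(b, a) = \varepsilon^2$ for a small fixed $\varepsilon$ playing the role of $1/\lambda$, is our own illustration and not one of the examples of the paper; the function `green_function` simply accumulates the probability of being at $a$ at times $0, 1, \ldots, \tau - 1$.

```python
def green_function(K, a, tau):
    """Expected number of visits to state a during [0, tau), starting from a."""
    n = len(K)
    dist = [0.0] * n
    dist[a] = 1.0
    visits = 0.0
    for _ in range(tau):
        visits += dist[a]
        # One step of the chain: dist <- dist * K.
        dist = [sum(dist[x] * K[x][y] for x in range(n)) for y in range(n)]
    return visits

eps = 0.01
# State 0 plays the role of a, state 1 the role of the more stable state b.
K = [[1 - eps, eps],
     [eps ** 2, 1 - eps ** 2]]

short_scale = 10        # well below the mean escape time 1 / eps = 100
long_scale = 100_000    # well above it
fraction_short = green_function(K, 0, short_scale) / short_scale
fraction_long = green_function(K, 0, long_scale) / long_scale
```

On the short scale the chain spends most of its time at state 0, whereas on the long scale the fraction of time at state 0 drops to roughly its stationary weight $\varepsilon/(1+\varepsilon)$, in line with the intuitive description above.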
In the reversible setting, there is a more convenient way to characterise recurrence and transience on a time scale, namely, in terms of escape times. For $a \in \mathcal{X}$, define $J(a) \triangleq \{x \ne a : \pi(x) \succeq \pi(a) \text{ as } \lambda \to \infty\}$ and $J^-(a) \triangleq \{x : \pi(x) \succ \pi(a) \text{ as } \lambda \to \infty\}$. Thus, $J(a)$ is the set of states whose stationary probabilities are asymptotically not negligible compared to $a$, and $J^-(a)$ consists of those states whose stationary probabilities are asymptotically larger than the stationary probability of $a$. Whether $a$ is $\tau$-transient or not depends on whether the chain has sufficient time to reach $J^-(a)$ or not: once the chain is in $J^-(a)$, it will spend only a negligible portion of its time in $a$. We refer to the time taken to go from $a$ to $J^-(a)$ as the escape time from $a$. The proof of the following proposition is given in Appendix C.4.
It follows from Proposition B.1 that $\tau$-transience is monotone in $\tau$: if a state is transient at a time scale $\tau$, then it is also transient at any time scale $\tau' \succeq \tau$. In particular, the recurrence behaviour of every state $a$ undergoes a transition at the time scale $\tau_a \triangleq E_a[T_{J^-(a)}]$: the state $a$ is recurrent at any time scale $\tau \prec \tau_a$ (a short time scale) and transient at any time scale $\tau \succ \tau_a$ (a long time scale). We call this the metastability transition of state $a$. We refer to a state $a$ as a metastable state when its metastability transition is non-trivial, i.e., when $J^-(a) \ne \emptyset$ and $\tau_a \to \infty$ as $\lambda \to \infty$. Note that if $J^-(a)$ is empty, then the critical scale $\tau_a$ is $\infty$ ($a$ is recurrent at any scale). Hence, in this case we call $a$ a stable state.
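The sets $J(a)$ and $J^-(a)$ are simple to compute when the stationary weights have a known order of magnitude. The sketch below assumes, purely for illustration, that $\pi(x) \asymp \lambda^{e(x)}$ for some real exponent $e(x)$ attached to each state; the exponents, the state labels and the function name are hypothetical and are not taken from the paper.

```python
def comparison_sets(exponent, a):
    """Return (J(a), J^-(a)) under the assumption pi(x) ~ lambda ** exponent[x].

    J(a): states other than a whose weight is not negligible compared to pi(a).
    J^-(a): states whose weight is asymptotically larger than pi(a)."""
    j = {x for x, e in exponent.items() if x != a and e >= exponent[a]}
    j_minus = {x for x, e in exponent.items() if e > exponent[a]}
    return j, j_minus

# Hypothetical exponents: "v" carries the largest weight, "u" and "y" are comparable.
exponent = {"u": 2.0, "v": 2.5, "x": 1.0, "y": 2.0}
j_u, j_minus_u = comparison_sets(exponent, "u")
j_v, j_minus_v = comparison_sets(exponent, "v")
```

Here $J^-(\mathrm{u}) = \{\mathrm{v}\}$ is non-empty, so u is a candidate metastable state (provided its escape time diverges), while $J^-(\mathrm{v}) = \emptyset$, so v is stable in the sense defined above.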
Our main objective is to derive sharp asymptotics for the mean and the distribution of the escape time $T_{J^-(a)}$, and to provide some information (albeit partial) about the typical escape trajectories. In the case of the hard-core dynamics on a bipartite graph (satisfying certain conditions) we will provide such a description for the state in which the weak part of the graph $U$ is covered with particles. This state turns out to be the "most stable" metastable state, i.e., the metastable state with the largest metastability scale. The transition from this metastable state to the stable state requires the formation of critical droplets whose size and shape are characterised by the solutions of an isoperimetric problem.

B.2 Mean escape time and transition duration
The proofs of the following two propositions are given in Appendix C.5. The mean escape time from a metastable state has the following rough asymptotics in terms of critical resistance.

Proposition B.2 (Mean escape time: rough estimate). For every $a \in \mathcal{X}$, $E_a[T_{J^-(a)}] \asymp \pi(a)\,\Psi(a, J^-(a))$ as $\lambda \to \infty$.
In conjunction with a good estimate on effective resistance, the above two propositions can often be used to give a sharp asymptotic estimate (with a precise pre-factor) for the escape time from a metastable state. Indeed, suppose we know that, for every $x \in J(a) \setminus J^-(a)$, the critical resistance $\Psi(x, J^-(x))$ is asymptotically smaller than the critical resistance $\Psi(a, J(a))$. Then Propositions B.3 and B.2 immediately give
We state this observation as the following corollary, which is proved in Appendix C.5.
We say that a set Z ⊆ X is upward closed if y ∈ Z whenever π(y) π(x) for some x ∈ Z. In the following we may for instance set Z = J − (a) or Z = {v}, where v is the unique stable state.
is the duration of the transition from $a$ to $Z$. Note that, by the Markov property and time-homogeneity, $P_a\big(T_Z - T_a^{(N_Z)} \in \cdot\big) = P_a\big(T_Z \in \cdot \mid T_Z < T_a^+\big)$. The following corollary is proved in Appendix C.6.

B.3 Exponential law for escape times
If $a$ is a metastable state (i.e., $J^-(a) \ne \emptyset$ and $\pi(a)\Psi(a, J^-(a))$ is large), then it can take a long time for the chain to pass from $a$ to $J^-(a)$. Starting from $a$, the chain is much more likely to return to $a$ quickly than to pass through the bottleneck between $a$ and $J^-(a)$. Each time the chain returns to $a$, the process starts afresh. The transition thus requires many repeated trials, each with a small success probability.
The hitting time of a rare event in a regenerative process approximately follows an exponential law (Keilson [37, Section 8]). The following proposition formulates a version of this phenomenon. See Appendix C.7 for its proof.
Proposition B.6 (Exponential law for regenerative processes). Let $\delta T$ be a positive random variable with finite mean and $B$ a Bernoulli random variable with success probability $\varepsilon > 0$. Let $(\delta T_k, B_k)_{k \in \mathbb{Z}^+}$ be a sequence of independent copies of the pair $(\delta T, B)$. Define the associated renewal process by $T_0 \triangleq 0$ and $T_k \triangleq T_{k-1} + \delta T_k$ for $k \ge 1$.
An immediate consequence is the approximate exponential distribution for the escape time from a metastable state, stated in the following corollary. See Appendix C.7 for its proof.
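The mechanism behind the exponential law is already visible in the simplest regenerative scheme, where every trial takes exactly one unit of time and succeeds independently with probability $\varepsilon = 1/M$. This deterministic-duration special case is our own simplification, not the general setting of Proposition B.6; in it, the number $N$ of trials up to the first success is geometric and the tail of $N/M$ can be written down exactly.

```python
import math

def rescaled_geometric_tail(t, M):
    """P(N / M > t), where N is the index of the first success in
    independent trials with success probability 1 / M."""
    eps = 1.0 / M
    # N > t * M means that the first floor(t * M) trials all fail.
    return (1.0 - eps) ** math.floor(t * M)

# As M grows (rarer successes), the tail approaches exp(-t).
gap_coarse = abs(rescaled_geometric_tail(1.0, 10) - math.exp(-1.0))
gap_fine = abs(rescaled_geometric_tail(1.0, 1000) - math.exp(-1.0))
```

The gap to $e^{-t}$ shrinks as $M$ grows, which is the elementary fact $(1 - 1/M)^{tM} \to e^{-t}$ underlying the regeneration argument.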

B.4 Asymptotics for tail probabilities
In the previous section, we saw that the tail probability of the escape time from a metastable state is asymptotically exponentially small, namely, $P_a\big(T_Z > t\,E_a[T_Z]\big) = e^{-t}[1 + o(1)]$ as $\lambda \to \infty$. In this section, we derive similar exponential upper bounds for the tail probabilities and conditional tail probabilities of more general hitting times using rougher but more flexible regeneration arguments. Such exponential upper bounds are one of the ingredients of the path-wise approach to metastability (see e.g. the paper by Manzo, Nardi, Olivieri and Scoppola [42]). The material of this section is not used in the rest of the current paper but will be needed in follow-up work.
Recall from Proposition B.2 that $E_a[T_{J^-(a)}] \asymp \pi(a)\,\Psi(a, J^-(a))$ for each $a \in \mathcal{X}$. For $A \subseteq \mathcal{X}$, define The following proposition is a variant of Theorem 3.1 in [42]. Its proof can be found in Appendix C.7.
Proposition B.8 (Tail probabilities of exit time). Let $A \subseteq \mathcal{X}$ be an arbitrary nonempty set of states. There is a constant $\alpha < 1$ such that, for every function $\rho = \rho(\lambda) \succ 1$, Examples of useful choices for $\rho$ are $\rho \triangleq \lambda^\delta$ (for a small constant $\delta > 0$) and $\rho \triangleq \log \lambda$.
The above proposition can be used to bound the tail and expected value of the exit time of a set A conditioned on hitting a certain subset of ∂A upon exit. Set (a, b) : a, b ∈ X , K(a, b) > 0} .
Proposition B.10 (Conditional mean exit time). Let A ⊆ X be an arbitrary set of states. Consider an arbitrary partitioning of ∂A into two non-empty sets B 1 and B 2 .
There is a constant $\alpha < 1$ (the one in Proposition B.8) such that, for every function $\rho = \rho(\lambda) \succ 1$, The proofs can be found in Appendix C.7.

B.5 Sharp asymptotics for effective resistance
As we saw earlier, a sharp estimate on the mean escape time requires a sharp estimate on effective resistance. Sharp asymptotics for effective resistance between two sets can be obtained through a detailed understanding of the bottleneck between them. The bottleneck between two sets is often described by a notion of critical gate, which pinpoints the critical transitions in a typical passage from one set to another. The notion of critical gate used below is not as general as it seems. For instance, it is not directly applicable to Glauber dynamics for the Ising model, but it suffices for our hard-core model.
Let $A, B \subseteq \mathcal{X}$ be two disjoint non-empty sets. We call a pair of disjoint sets $Q, Q^* \subseteq \mathcal{X}$ a critical pair between $A$ and $B$ when (see Fig. 14)
By an optimal path from $A$ to $B$, we mean a path whose critical resistance is of the same order as $\Psi(A, B)$, i.e., a path $\omega : A \leadsto B$ with $r(\omega) \asymp \sup_{e \in \omega} r(e) \asymp \Psi(A, B)$ as $\lambda \to \infty$. Observe that an optimal path $A \leadsto Q$ does not pass through $Q^*$, and an optimal path $Q^* \leadsto B$ does not pass through $Q$. If $(Q, Q^*)$ is a critical pair between $A$ and $B$, we write $S(A, Q, Q^*, B) \triangleq \{x \in \mathcal{X} : \text{there exists a path } \omega : A \leadsto x \text{ not passing } Q^* \text{ such that } \Psi(\omega) \preceq \Psi(A, B)\}$, which we think of as the set of states "behind the critical gate". We have used the notation $\Psi(\omega) \triangleq \sup_{e \in \omega} r(e)$ for the critical resistance of the path $\omega$. Note that $\Psi(\omega) \asymp r(\omega)$. The following proposition is proved in Appendix C.8.
In general, a critical gate between two sets $A$ and $B$ (as defined above) may or may not exist. Even when it exists, identifying a critical gate may require painstaking combinatorial analysis. However, once available, a critical gate provides a sharp estimate on the effective resistance between $A$ and $B$. The following proposition is proved in Appendix C.8.

B.6 Passage through the bottleneck
Let a be an arbitrary state and B a set not containing a. If a critical gate between a and B exists, then the passage from a to B is almost surely through the critical gate.
The following proposition is proved in Appendix C.8.

Proposition B.13 (Critical gate is bottleneck).
Suppose that $(Q, Q^*)$ is a critical pair between $a$ and $B$, and let $S \triangleq S(a, Q, Q^*, B)$ be the set of states behind the critical gate. As $\lambda \to \infty$,

C.1 Nash-Williams inequality
Proof of Proposition A.1. Write $dW(x, y) \triangleq W(y) - W(x)$ for the relative voltage of two nodes. By the conservation of energy (also known as the adjointness of $\theta \mapsto \operatorname{div} \theta$ and $f \mapsto df$; see e.g. Lyons and Peres [41], Section 2.4), we have $\mathcal{C}(A \leftrightarrow B) = \sum_e c(e)\,(dW(e))^2$. (C.2)
By the Cauchy-Schwarz inequality, for each $k$ we can write The claim follows.

C.2 Effective resistance versus critical resistance
Proof of Proposition A.2.

C.3 Estimates on voltage
Proof of Proposition A.3. By the short-circuit principle, we may assume that A and B are singletons, i.e., A = {a} and B = {b} for some nodes a and b. We have where the last equality uses the reciprocity equality in (A.11). The other inequality follows symmetrically, by noting that W A,
Proof of Proposition A.5. For brevity, we write W(x) instead of W_{A,B}(x). If x, y ∈ A ∪ B, then the claim is trivial. If x or y is in A ∪ B and the other is not, then the conclusion follows directly from Proposition A.4. So, assume that x, y ∈ X \ (A ∪ B). We verify that The opposite inequality follows by symmetry.
The first term can be estimated as Again, the first term reduces to Similarly, for the second term, we get (C.14)
Conversely, assume that $G_\tau(a, a) \prec \tau$. Let $A \subseteq \mathcal{X}$ be the set of states that can be reached from $a$ without passing through $J^-(a)$. Note that $\pi(b) \preceq \pi(a)$ for each $b \in A$.
Therefore, by the reciprocity identity in (A.10), we have that On the other hand, we note that

C.5 Mean escape time
Proof of Proposition B.2. We know from (A.7) that for every $x \in \mathcal{X}$. For $x \notin J(a) \cup \{a\}$, we have, by definition, that $\pi(x) \prec \pi(a)$ as $\lambda \to \infty$, whereas for $x \in J(a) \setminus J^-(a)$, we have $\pi(x) \asymp \pi(a)$. Therefore Since the chain is finite, we must have $\emptyset \ne Z_n \subseteq Z$ for some $n$. By conditioning, we have
To estimate H(θ/M ), we note that, conditional on B = 1, δT /M is a positive random variable whose expected value η M tends to 0. Therefore δT /M converges in distribution to a unit mass at 0, and hence H(θ/M ) = 1 + o(1).
For $G(\theta/M)$, we need a more accurate estimate. We note that, conditional on $B = 0$, $\delta T$ is a positive random variable with mean 1. Therefore $G(\theta)$ is continuously differentiable with $G'(0) = i$, and a Taylor approximation gives $G(\theta) = 1 + i\theta + o(\theta)$ as $\theta \to 0$. It follows that, for each $\theta \in \mathbb{R}$, $G(\theta/M) = 1 + i\theta/M + o(1/M)$. Altogether, for each $\theta \in \mathbb{R}$, we get (C.38) as $\lambda \to \infty$. Therefore $T_N/M$ converges in distribution to an exponential random variable with rate 1. Finally, since the exponential distribution $t \mapsto e^{-t}$ is continuous, the
Proof of Proposition B.9. For $x \in A$, we can write By Proposition B.8, we have $P_x\big(T_{\partial A} > \rho\,\Gamma(A)\big) \preceq \alpha^\rho$ for some constant $\alpha > 0$ independent of $\rho$. Let $w$ be a simple path from $x$ to $B_1$ that does not pass through $\partial A$. The length of $w$ is at most $|A|$ and so $P_x(T_{B_1} < T_{B_2}) \ge P_x(X \text{ follows } w) \ge \kappa^{|A|}$, with $X = (X(n))_{n \in \mathbb{N}}$ the discrete-time chain defined in Section 1.2. The claim therefore follows.
Proof of Proposition B.10. We have (C.45) Using the bound in Proposition B.9 iteratively, we get, via the Markov property and time homogeneity, that which gives as $\lambda \to \infty$.

C.8 Critical gate
Proof of Proposition B.11. Suppose that $r(x, y) \preceq \Psi(A, B)$. Since $x \in S$, there is a path $A \leadsto x$ with $\Psi(A \leadsto x) \preceq \Psi(A, B)$ that does not pass $Q^*$. Continuing this path with the transition $x \to y$, we obtain another path $A \leadsto x \to y$ with $\Psi(A \leadsto x \to y) \preceq \Psi(A, B)$ that does not hit $Q^*$, except possibly at $y$. But, since $y$ is assumed to be outside $S$, it must be in $Q^*$. By assumption, $\Psi(y, B) \prec \Psi(A, B)$, which means that there is a path $y \leadsto B$ with $\Psi(y \leadsto B) \prec \Psi(A, B)$. Gluing this path with $A \leadsto x \to y$, we get an optimal path $A \leadsto x \to y \leadsto B$, which, by definition, must pass through the critical gate. It follows that $x$ must be in $Q$, because $A \leadsto x$ does not pass $Q^*$ and $y \leadsto B$ does not pass $Q$.
Next we verify the lower bound. For each $x \in Q$ and $y \in Q^*$ with $x \sim y$, let $\omega_{x,y}$ be an optimal path $A \leadsto x \to y \leadsto B$ whose parts $A \leadsto x$ and $y \leadsto B$ are also optimal. Thus, the transition $x \to y$ is the unique transition on $\omega_{x,y}$ whose resistance has the highest order of magnitude as $\lambda \to \infty$. For each pair $(a, b)$ with $a \sim b$, let $n(a, b)$ denote the number of pairs $(x, y)$ such that $\omega_{x,y}$ passes through $a \to b$. By the extended dual Nash-Williams
$c(Q, Q^*)$. Therefore $E_a[N_{T_B}(x \to y)] = o(1)$ as $\lambda \to \infty$. That $P_a(T_{xy} < T_B) = o(1)$ follows from the Markov inequality.
When following σ, the number of particles on V goes from |A_0| to |A_m|, each step having at most one more particle on V than the previous step. Therefore, there are configurations on σ that have exactly s particles on V. Let σ(ℓ) be the first configuration on σ with s particles on V. Since s > |A_0|, we have ℓ ≥ 1 and the transition σ(ℓ − 1) → σ(ℓ) is of the type +V (i.e., adding a particle on V).
Proof of Proposition 3.3. Since the graph is connected and V ≠ ∅, the neighbourhood N(a) of every site a ∈ V is non-empty. The claim thus follows immediately from Lemmas 3.1 and 3.2.

C.10 No-trap condition via ordering
Proof of Proposition 3.4. Consider $x \notin \{u, v\}$. Let $i \in U$ and $j \in V$ be two adjacent sites that are not occupied in $x$. Such sites exist. Indeed, $N(U \setminus x_U) \not\subseteq x_V$, otherwise the graph would not be connected. By assumption, there is a standard path $u = \omega(0) \to \omega(1) \to \cdots \to \omega(m) \in J(u)$ whose first particle on $V$ is on site $j$. Note that this path starts by removing particles from neighbours of $j$ until it is possible to place a particle on site $j$. Since re-ordering the removal of these particles from $U$ does not affect the condition of being a standard path, we may assume that the first particle to be removed is from site $i$.
Argument. Let s denote the number of particles that x ∧ ω(k) has on V . We consider three separate cases.
Case 1: $s = 0$. The configuration $x \wedge \omega(k)$ has no particle on $V$. Moreover, the choice of $\omega$ ensures that $x \wedge \omega(k)$ has no particle on site $i \in U$. It immediately follows that $\pi(x \wedge \omega(k)) \prec \pi(u)$.
(C.76) (For the latter inequality, recall that $\omega(k_1) \notin J(u)$.)
Case 3: $s = |\omega_V(m)|$. This is impossible. Indeed, every particle that $x \wedge \omega(k)$ has on $V$ is also present in $\omega(m)$. But, by the choice of $\omega$, $\omega(m)$ has a particle on site $j \in V$ on which $x$ has no particle. Therefore $x \wedge \omega(k)$ has strictly fewer particles on $V$ than $\omega(m)$.
This concludes the proof.
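The existence claim at the start of the proof of Proposition 3.4 (every hard-core configuration other than $u$ and $v$ leaves some adjacent pair $i \in U$, $j \in V$ unoccupied) can be checked by brute force on a small connected bipartite graph. The sketch below is illustrative only: the graph, the site names and both helper functions are hypothetical, and the graph is stored as a map from each site of $U$ to its set of neighbours in $V$.

```python
from itertools import combinations


def unoccupied_adjacent_pair(nbrs_of_U, occupied):
    """Return some adjacent (i, j), i in U and j in V, both unoccupied, or None."""
    for i in sorted(nbrs_of_U):
        if i in occupied:
            continue
        for j in sorted(nbrs_of_U[i]):
            if j not in occupied:
                return i, j
    return None


def hard_core_configurations(nbrs_of_U):
    """Yield every set of sites containing no pair of adjacent sites."""
    U = sorted(nbrs_of_U)
    V = sorted(set().union(*nbrs_of_U.values()))
    sites = U + V
    for size in range(len(sites) + 1):
        for subset in combinations(sites, size):
            x = set(subset)
            if all(not (i in x and j in x) for i in U for j in nbrs_of_U[i]):
                yield x


# Hypothetical example: the path u1 - v1 - u2 - v2, which is connected.
graph = {"u1": {"v1"}, "u2": {"v1", "v2"}}
u = {"u1", "u2"}  # U fully occupied
v = {"v1", "v2"}  # V fully occupied
exceptions = [x for x in hard_core_configurations(graph)
              if unoccupied_adjacent_pair(graph, x) is None]
```

On this example the only configurations without such a pair are `u` and `v`, in line with the connectivity argument in the proof.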

C.11 Passing the bottleneck
Proof of Lemma 3.7. Suppose that $x$ is a basic configuration in $\omega$ with $|x_V| = s$ particles on $V$. By the remark before the lemma, we have Writing $\Delta(x) = \Delta(s) + \Delta(x) - \Delta(s)$, $\Delta(s) = \Delta(s^*) + d\Delta(s)$ and $s = s^* + ds$, and using the assumption we obtain On the other hand, since $\omega$ is optimal, we know by Lemma 3.3 that $x$ is $\varepsilon$-optimal. To see the latter claim, note that any configuration that is $\varepsilon$-optimal for some $\varepsilon < 1$ is, in fact, optimal.
Proof of Lemma 3.9. Let $\omega(i)$ be the next basic configuration after $\omega(p)$. Since $\omega(p)$ is the last basic configuration before $\omega(q)$ having fewer than $s^* - 1$ particles on $V$, $\omega(i)$ must have $s^* - 1$ particles on $V$. We have either of the following two possibilities when going from $\omega(p)$ to $\omega(i)$ on $\omega$: Suppose that $\omega(i)$ has $t = t^* + dt$ particles on $U$. Then $\omega(p)$ has at most $t + 1$ particles on $U$. By the remark before Lemma 3.7, the critical resistance of $\omega$ satisfies as $\lambda \to \infty$. But, since $\omega$ is optimal, we know from Lemma 3.3 that Combining (C.82) and (C.83), it follows that $dt \ge 1 + \alpha > 1$. Since $dt$ is an integer, we find that in fact $dt \ge 2$, hence proving the first claim. In particular,
Proof of Proposition 3.10. Let $\omega$ be an optimal path from $u$ to $J(u)$. By definition, $s^* \ge 1$.
Let us first assume that $s^* \ge 2$. Let $\omega(q)$ be the first basic configuration in $\omega$ that has $s^* + \kappa$ particles on $V$. Let $\omega(p)$ (with $p < q$) be the last basic configuration before $\omega(q)$ with $s^* - 2$ particles on $V$. Finally, let $\omega(r)$ (with $p < r < q$) be the last (not necessarily basic) configuration before $\omega(q)$ having $s^* - 1$ particles on $V$. Set $y := \omega(r)$, $x := \omega(r - 1)$ and $z := \omega(r + 1)$.
If $s^* = 1$, we set $t := 0$ and choose $r$ (with $t \le r < q$) to be the last configuration before $\omega(q)$ having no particle on $V$. Note that $r > t$, for otherwise the graph would not be connected. In this case, $\omega(t) = u$ is optimal and the rest of the argument goes through without change.
Let $x \in Q$ and $y \in Q^*$ be such that $x \xrightarrow{-U} y$. By definition, there exist $A \in \mathcal{A}$ and $B \in \mathcal{B}$ such that $y_V = A$ and $y_U = U \setminus N(B)$. Let $i$ be the unique element of $B \setminus A$.
Then, $x_V = y_V$ and $x_U = y_U \cup \{j_0\}$ for some $j_0 \in N(i) \setminus N(A)$. (Note that $N(i) \setminus N(A)$ is non-empty. Otherwise, $\Delta(B) = \Delta(A) - 1$, which gives $g(s^* - 1) > g(s^*)$. The latter inequality clearly cannot happen if $s^* > 1$. On the other hand, when $s^* = 1$, the set $N(i) \setminus N(A) = N(i)$ cannot be empty, because the graph is assumed to be connected.) Let $j_1, j_2, \ldots, j_d$ be an enumeration of $N(i) \setminus \{j_0\}$.
According to (H4.b), there is an isoperimetric progression from $\emptyset$ to $A$, consisting only of sets of size at most $s^* - 1$. Let $\omega$ be the path associated to such a progression.
We next argue that $A$ is in fact connected in $L$. Indeed, suppose that $A$ is not connected. Let $A_1, A_2, \ldots, A_k$ be the connected components of $A$. Since $A$ is convex and $A \cup N(A)$ is connected in the original lattice, we can re-order the sets $A_1, A_2, \ldots, A_k$ in such a way that the two sets $N(A_1 \cup \cdots \cup A_{k-1})$ and $N(A_k)$ share exactly one element (Fig. 16). However, since $A$ is convex, we can shift $A_k$ to obtain a set $A'_k$ disjoint from $A_1 \cup \cdots \cup A_{k-1}$ such that $N(A'_k)$ and $N(A_1 \cup \cdots \cup A_{k-1})$ share at least two elements. It follows that $\Delta(A_1 \cup \cdots \cup A_{k-1} \cup A'_k) < \Delta(A)$, which is a contradiction.
A convex and connected set in $L$ is easily seen to satisfy $N_{1010}(A) = \emptyset$. Let $L'$ be the graph with vertex set $U$ and with an edge between $(a, b)$ and $(a', b')$ if and only if $|a' - a| = |b' - b| = 1$. This is the lattice dual to $L$. Since $A$ is connected and convex, the elements of $N_1(A) \cup N_2(A) \cup N_3(A)$ induce a simple cycle in $L'$, which is the contour encompassing $A$. We denote this cycle by $c(A)$. Since $A$ is convex and connected, there is a natural one-to-one correspondence between the edges of $c(A)$ and the edges of the contour $c(A)$ encompassing the rectangle $A$. Let us label the vertices of $c(A)$ with pairs in $\{1, 2, 3\}^2$ as follows (see Fig. 17). Let $x$ be a vertex of $c(A)$, and let $(y, x)$ and $(x, z)$ be the two edges incident to $x$. Let $(y', x')$ and $(x', z')$ be the edges of $c(A)$ corresponding to $(y, x)$ and $(x, z)$, respectively. If $x \in N_i(A)$ and $x' \in N_j(A)$, then we label $x$ with $(i, j)$. Note that the only possible labels are $(1, 1)$, $(2, 2)$, $(3, 1)$ and $(1, 3)$, and that the four corners of $c(A)$ are precisely the vertices with label $(1, 1)$.
EJP 23 (2018), paper 97.
Counting
The sets $L(d, r, k)$ satisfy the recursion $L(d, r, k)$ can be divided into those elements $w$ with $\|w\| = r$ and those with $\|w\| = r + 1$. The first part is simply the set $B_1(d, r, k) := \{w \in \{0, 1\}^d : \|w\| = r\} \setminus L(d, r, k)$ and has cardinality $\binom{d}{r} - k$.
The second part is $B_2(d, r, k) := N(L(d, r, k)) \cap \{w \in \{0, 1\}^d : \|w\| = r + 1\}$. The elements of $B_2(d, r, k)$ are the words obtained from the elements of $L(d, r, k)$ by turning a 0 into a 1. Denoting the cardinality of $B_2(d, r, k)$ by $\psi_d(r, k)$, the recursion (5.8) follows easily from (C.94).
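The two parts described above can be enumerated directly for small parameters. The sketch below is illustrative and rests on stated assumptions: the definition of $L(d, r, k)$ is not reproduced here, so the code takes an arbitrary set `L` of $k$ binary words with $r$ ones (chosen below as the first $k$ words in the order produced by `itertools.combinations`, purely for illustration). The helper names `words_of_weight` and `upper_neighbours` are hypothetical.

```python
from itertools import combinations
from math import comb


def words_of_weight(d, r):
    """All binary words of length d with exactly r ones, as tuples."""
    return [tuple(1 if pos in ones else 0 for pos in range(d))
            for ones in combinations(range(d), r)]


def upper_neighbours(L):
    """Words obtained from the elements of L by turning a single 0 into a 1."""
    out = set()
    for w in L:
        for pos, bit in enumerate(w):
            if bit == 0:
                out.add(w[:pos] + (1,) + w[pos + 1:])
    return out


d, r, k = 4, 2, 3
# Illustrative stand-in for L(d, r, k): any k words with r ones.
L = set(words_of_weight(d, r)[:k])
B1 = set(words_of_weight(d, r)) - L  # first part: cardinality C(d, r) - k
B2 = upper_neighbours(L)             # second part: words with r + 1 ones
print(len(B1), comb(d, r) - k, len(B2))  # 3 3 3
```

For this choice `L` consists of 1100, 1010 and 1001, so `B2` consists of 1110, 1101 and 1011. The cardinality of `B2` is what the text denotes by $\psi_d(r, k)$ when `L` is the set $L(d, r, k)$ itself.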