The sharp interface limit of an Ising game

The Ising model of statistical physics has served as a keystone example of phase transitions, thermodynamic limits, scaling laws, and many other phenomena and mathematical methods. We introduce and explore an Ising game, a variant of the Ising model that features competing agents influencing the behavior of the spins. With long-range interactions, we consider a mean-field limit resulting in a nonlocal potential game at the mesoscopic scale. This game exhibits a phase transition and multiple constant Nash-equilibria in the supercritical regime. Our analysis focuses on a sharp interface limit for which potential minimizing solutions to the Ising game concentrate on two of the constant Nash-equilibria. We show that the mesoscopic problem can be recast as a mixed local/nonlocal space-time Allen-Cahn type minimization problem. We prove, using a $\Gamma$-convergence argument, that the limiting interface minimizes a space-time anisotropic perimeter type energy functional. This macroscopic scale problem could also be viewed as a problem of optimal control of interface motion. Sharp interface limits of Allen-Cahn type functionals have been well studied. We build on that literature with new techniques to handle a mixture of local derivative terms and nonlocal interactions. The boundary conditions imposed by the game theoretic considerations also appear as novel terms and require special treatment.


Introduction
This article develops an Ising game as a prototypical example for spin games and the phenomenon of phase transitions in mean-field games. Game theoretic models incorporate rational agent behavior, introducing additional complexity to the particle models of statistical physics. These models are suited to applications including social dynamics, economics, and neural networks.
We begin by introducing our framework. To put our work into context, we survey a series of results in the literature, including the passage from discrete spin games to continuous mean-field games. Our main result, stated in Section 1.4, focuses on a mesoscopic to macroscopic scaling limit for the Ising game.
1.1. Motivation. In economics, the study of games has been used to form insights into phenomena that arise when the players exhibit free will and decision-making in their choice of actions. Beyond the original applications to economics and finance [45], game theoretic models have been used in evolutionary biology [32] and opinion dynamics [24]. Along with games, one can consider distributed optimization problems, where many individual agents take actions with a collective objective; examples arise from the management of a smart energy grid [46] or from training the weights of a neural network [43].
Phase transitions have been proposed to be important phenomena in understanding biological systems [11], [38], neural dynamics [28], [13], and social behavior [44]. Many frameworks exist to model such systems. In this work, we consider an intersection between the frameworks of dynamic games and spin systems that allows for both concrete calculations and general mathematical analysis.
There are many interesting aspects that arise in phase transitions, including mesoscopic and macroscopic scaling limits and interface dynamics [16], fluctuations in the mesoscopic limit and universality classes [36], and spontaneous phase separation [17]. All of these aspects have been studied at length for many different particle models. In this work, we first briefly review the mesoscopic limit, which has been studied extensively in the context of mean-field games. We then focus our technical analysis on the macroscopic limit and interface dynamics, where we find novel features that require new techniques.
1.2. Spin games. Spin systems arise in the analysis of the magnetization of solid-state materials where the spin represents the magnetic moment of a particle. Another common application of spin systems is that of the grand canonical ensemble of particles interacting as a fluid. In these models, the spin is interpreted as a discretization of the particle density. In this way, spin systems can be used generally as a discretization of models with continuous state variables. Spin systems have been considered in connection with the mean-field behavior of populations in [29], [30], and [14]. An Ising game with discrete player actions, played on graphs, was studied in [35], where a dynamic evolution was considered that behaves similarly to the Ising model in the mean-field limit.
We combine the concept of spin systems with a multiplayer game, where each agent controls their own spin in an optimal manner. The Ising game is a prototypical example of such a spin game that mixes a discrete state variable with a continuous state position. A fundamental distinction with models of statistical physics (as well as the evolution considered in [35]) is that players will look ahead, and their prediction of the future influences their control decisions. The spin game models provide an ideal environment to study the phenomenon of phase transitions with the addition of rational behavior.
The combination of discrete and continuous state variables has yet to be considered in the literature on mean-field games, which extensively covers games with either discrete or continuous state variables. A general treatment of finite-state mean-field games is given in [10] and [26]. Phase transitions were observed in [33] as a bifurcation of the ergodic mean-field game system. The solution to the master equation is analyzed in [4] and [12]. Phase transitions in continuous space mean-field games have been studied using bifurcation analysis in [27]. Further analysis of the fluctuations about equilibria was undertaken in [12].
We find that the Ising game with nonlocal interactions undergoes a phase transition as the strength of player interactions changes. When the interaction strength is small, the players do not deviate from a 'rest' behavior that results in independent spins with zero mean. Above a critical interaction strength, the players will instead exert their control to align more closely with their neighbors, resulting in a nonzero mean.
The phase transition corresponds to a bifurcation of solutions to the mean-field game system. When we include the nonlocal interactions of continuous spatial variables, we find new dynamics that we can best understand by considering a macroscopic limit.
1.3. Macroscopic Limit. The literature on the Ising model and other macroscopic limits of phase transition models is vast. For a treatment of the analogous results in the Ising model, see [16], [5], [3]. Novel mathematical tools were developed in [21], [31]. Our approach is more akin to the work on the Van der Waals-Allen-Cahn-Hilliard model of gradient phase transitions [37]. Additional tools for similar models are developed in [6], [1], [15], [40].
We establish a surprising equivalence between the mesoscopic spin field optimal control problem and a mixture of the local and nonlocal Van der Waals-Allen-Cahn-Hilliard models. The resulting macroscale interface minimizes a cost functional, which takes the form of an anisotropic space-time area. Alternatively, the resulting interface evolves in time with controlled propagation speed. Closely related macroscopic models of distributed optimal control are considered in [8], [7].
While we use many tools from the related phase transition models, combining them in this new way introduces many novel aspects of the analysis. The boundary conditions imposed by the game theoretic considerations also appear as novel terms and require special treatment.

1.4. Main result, outline, and open questions. Our contributions consist of both the introduction and the analysis of spin game models. More precisely, we start by introducing and motivating a class of spin game models at the microscopic, mesoscopic, and macroscopic scales. Next, we focus on the mesoscopic to macroscopic scaling limit. We reduce the study of equilibria in the Ising game to critical points of an energy functional that combines a kinetic energy term, a double-well potential, and a nonlocal energy in the spatial directions. We introduce an analysis of the "effective surface tension" and initial and terminal time boundary layer costs associated with this model. Finally, we use new analytical techniques, within the established context of the sharp interface limit theory of phase field models, to handle the mix of local and nonlocal interaction costs and the additional boundary layer terms.
After expressing a more general framework, our analysis focuses on a more specific Ising game, which consists of the following elements:
• A spin field on the space-time domain, $s : [0,T] \times \mathbb{T}^d \to [-1,1]$, that represents the mean spin. The spin field is determined by selecting control policies, $a_\pm : [0,T] \times \mathbb{T}^d \to \mathbb{R}_+$, that represent the rates of flipping from $+1$ to $-1$ and from $-1$ to $+1$. The evolution equation for the spin field is given by where the small parameter $\lambda > 0$ corresponds to a mesoscopic scale.
• A Lagrangian function, $L : [-1,1] \times (\mathbb{R}_+)^2 \to \mathbb{R}$, of the local mean spin and controls, which is a local running cost density associated with a player controlling the rate at which their spin flips. We work with the form that closely resembles an entropic term in the Ising model. The parameter $\beta^{-1} > 0$ has an interpretation as a cost coefficient and appears analogous to the temperature. The convex Lagrangian $L$ enforces that the flipping rates are positive and encourages $a_\pm$ to coincide at a neutral value. Consequently, $L$ encourages the mean spin $s$ to rest at zero. The derivation of this form of Lagrangian from a microscopic model is covered in Section 2 for further motivation.
• An interaction running potential cost density of the form where $J_\lambda(z) = \lambda^{-d} J(\lambda^{-1} z)$ is a nonnegative, rescaled interaction kernel that encourages players to align their spins with their neighbors at a length scale of $\lambda$. The strength of the interaction is given by $\hat{J} = \int_{\mathbb{R}^d} J(x)\,dx$. As explained in Section 2, minimizing the total potential cost ($C_\lambda$ below) corresponds to Nash equilibrium strategies. The competition between the Lagrangian and the interaction energy results in a phase transition: when $\beta \hat{J} > 1$, players prefer to organize themselves at a constant Nash equilibrium with mean spin $s > 0$ or $-s$. We denote the corresponding running cost density of the constant Nash equilibria by $\Lambda$.
• An initial spin configuration $s_0 : \mathbb{T}^d \to [-1,1]$, and a terminal cost of the form These initial and terminal conditions cause solutions to deviate from the constant Nash equilibria.

Our specific problem is now to minimize and to study the sharp interface limit $\lambda \to 0$ of the averaged rescaled cost (in the macroscopic $(\tau, z)$ coordinates), under the constraint (1.1) and the initial data $s(0, \cdot) = s_0$. As $\lambda \to 0$, the mean spin concentrates on the set $\{-s, s\}$, except on an interface $\Sigma$ and the boundary layers at $\tau = 0$ and $\tau = T$. The asymptotic behavior of $C_\lambda$ at the mesoscopic $\lambda$-scale allows us to characterize the interface between the equilibrium states as a local minimizer of the macroscopic energy, as we will see in the context of the $\Gamma$-convergence.
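The rescaling $J_\lambda(z) = \lambda^{-d} J(\lambda^{-1} z)$ concentrates the kernel at length scale $\lambda$ while leaving the interaction strength $\hat{J}$ unchanged. The following is a minimal numerical sketch of that fact in dimension $d = 1$; the Gaussian kernel and the helper names `J`, `J_lam`, and `total_mass` are illustrative assumptions, not objects defined in this article.

```python
import math

def J(x):
    """Hypothetical 1-D interaction kernel: a standard Gaussian with total mass 1."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def J_lam(z, lam, d=1):
    """Rescaled kernel J_lambda(z) = lambda^{-d} J(z / lambda)."""
    return lam ** (-d) * J(z / lam)

def total_mass(kernel, half_width, n=20001):
    """Riemann sum of the kernel over [-half_width, half_width] on a uniform grid."""
    h = 2.0 * half_width / (n - 1)
    return sum(kernel(-half_width + i * h) for i in range(n)) * h

# Interaction strength before and after rescaling with lambda = 0.1.
J_hat = total_mass(J, 10.0)
J_hat_rescaled = total_mass(lambda z: J_lam(z, 0.1), 10.0)
```

Both sums approximate the same integral, so the threshold condition on $\beta \hat{J}$ does not depend on $\lambda$; only the spatial range of the interaction shrinks.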
Our analysis begins with the illuminating observation that the cost can be decomposed, up to a total derivative, into the sum of a double-well potential, a Dirichlet-like strictly convex function of $\partial_\tau s$, and a nonlocal interaction cost. The integrand of the space-time integral of (see Corollary 3.4). Using this decomposition, we show that the rescaled spin variable converges to one of the stable equilibrium states $-s$ or $s$, with a transition interface $\Sigma$ in between the states. Moreover, we show that the cost in the macroscopic limit, in the sense of $\Gamma$-convergence, is given by the sum of initial

Our main result, which is stated in full below in Theorem 4.1, is summarized by:

Theorem 1.1. The mesoscopic scale cost functionals $C_\lambda$ converge as $\lambda \to 0$, in the sense of $\Gamma$-convergence on appropriate spaces, to the macroscopic scale cost among BV functions $s : \mathbb{T}^d \to \{\pm s\}$ with $\Sigma = \Sigma(s)$ as the discontinuity set, and $\nu = \nu(s)$ as the measure-theoretic normal vector field on $\Sigma$, pointing toward the $s$-region.
In particular, sequences of minimizers for $C_\lambda$ are precompact in $L^1$ and any subsequential limit as $\lambda \to 0$ minimizes $V$.
See Figure 1 for an illustration of a macroscopic limit $s$. The initial and terminal time costs can be represented by a one-dimensional problem, showing that the solutions are locally constant in space near the boundary layer. A remarkable relation appears between the initial and end times that (see Remark 3.9 below). The $\Phi$ term contains all of the asymmetry between the initial and final time as the remaining terms in the decomposition of Corollary 3.4 obey a time-reversal invariance as $s(\tau, z)$ goes to $s(-\tau, z)$. This relation also shows that without an end cost $g$ it is advantageous to relax back closer to $0$ from the equilibria $\pm s$ near the terminal time. This feature is intimately connected with the forward-looking nature of control problems: the agents anticipate the end time and turn off their controls to save costs.
The key element of the analysis is a family of quantitative patching estimates, which we use to localize the profile of the spin variable near the transition interface. The other essential ingredient is the elegant idea introduced in [1] to study a nonlocal interaction energy, where the patching lemmas are applied to polyhedral regions to construct a recovery sequence for the $\Gamma$-convergence of the cost. While this approach does not readily yield quantitative error estimates, for our purpose it provides a relatively simple alternative to perturbing smooth surfaces and comparing them with hyperplanes.
There are interesting open questions that arise from our analysis. For instance, we can question the shape of the minimizer for $V$ as well as the regularity or geometry of the interface. We suspect that our limit Lagrangian $L$ is at least continuous with respect to $\nu$ when the interaction kernel $J$ is isotropic, but it is not easy to check this due to the anisotropy created by the time variable. Even with a regular $L$, the shape of the minimizing interface is not necessarily regular in higher dimensions, as we see from [41], [39]. It is also natural to ask whether $L$ can be obtained by only considering planar traveling wave solutions. This is true in the case of the nonlocal interaction energy studied in [1], see [2]. We answer this question positively for the boundary layer costs, but it remains open for the interfacial cost.
Another natural question concerns the asymptotic behavior of the phase transition near $s = 0$ as the "inverse temperature" $\beta$ approaches the critical value, where the local parts of the cost dominate and the double-well structure disappears via $\pm s$ converging to zero.
The boundary layer terms appearing in the macroscopic cost are a novel feature of our problem that merits further study: for instance, there is an apparent symmetry between the initial and terminal cost (see Remark 3.9 and (1.2)).
While we specialize our model with a specific Lagrangian, for which calculation is convenient, we expect that our analysis extends directly to a more general class. For example, we can consider the Lagrangian functions that have the form where $l(a)$ is convex and satisfies $l'(a) \to -\infty$ as $a \to 0^+$ and $l'(a) \to +\infty$ as $a \to +\infty$.
A more interesting and difficult open question is whether one can obtain similar results for nearest-neighbor interaction costs, analogous to what was achieved for the three-dimensional nearest-neighbor Ising model in [5].

Spin games
In this section, we introduce and provide a non-rigorous exposition on spin games: $N$-player games, mean-field control, and mean-field games. In the length scale spectrum considered in this work, the $N$-player game is a microscopic model, and the mean-field control and game problems are mesoscopic models. The derivations discussed in this section are meant as motivation and contextualization for the rigorous mathematical work that we conduct later in the paper, which considers a mesoscopic to macroscopic limit.
2.1. N-player spin games. We consider $N$ players with fixed positions on a uniform square lattice, $x^{N,i} \in \mathbb{T}^d$. The collection of all positions is denoted as $x^N \in \mathbb{T}^{dN}$. Each player has a discrete spin state $\sigma^{N,i} \in S = \{-1, 1\}$, and the player controls the rate at which their spin flips according to the control $A^{N,i}_t \in \mathbb{R}_+$. We denote the collection of all spins as $\sigma^N \in S^N$ and of all controls as $A^N_t \in (\mathbb{R}_+)^N$. When determining their optimal strategy, each player may consider the states of all other players, which we encode into the empirical spin measure $m_{\sigma^N, x^N} \in \mathcal{M}(\mathbb{T}^d)$, the space of finite variation signed measures on

Denote by $\mathcal{P}(S^N)$ the space of probability measures on spin configurations. We will consider the evolution of state distributions $\mu^N_t \in \mathcal{P}(S^N)$. To ease the notation we drop $N$ when it can be inferred from the context. The problem consists of specifying the following:
• A Lagrangian function on the control space, $l : \mathbb{R}_+ \to \mathbb{R}$, which is the cost associated with a player flipping their spin.
• An individual running player cost on the state and empirical measure space, $f$ : We also consider the case of a global running cost that is a function only of the empirical measure, $\bar{f} : \mathcal{M}(\mathbb{T}^d) \to \mathbb{R}$.
• A terminal cost on the state and empirical measure space, $g : S \times \mathbb{T}^d \times \mathcal{M}(\mathbb{T}^d) \to \mathbb{R}$, and the analogous case of global terminal cost $\bar{g} : \mathcal{M}(\mathbb{T}^d) \to \mathbb{R}$.
• An initial distribution of states $\mu^N_0 \in \mathcal{P}(S^N)$. E.g., $\sigma^i$ are independent with mean $s_0(x^i)$ for $s_0$ :

An important aspect of game theoretic problems is the information available to the players. We work here assuming full information, i.e., closed loop, where each player may choose their control as a function of the state of all the other players, $(t, \sigma) \mapsto A^i_t(\sigma)$. Given $A$, we define the joint distribution $\mu^A_t \in \mathcal{P}(S^N)$ as the joint distribution of all players with spin $i$ flipping at rate where $t^i \sigma$ denotes the collection of spins with the $i$th component flipped to be $-\sigma^i$.
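The closed-loop dynamics described above, in which each spin flips at a rate that may depend on the full configuration, form a continuous-time Markov chain on $S^N$. A minimal simulation sketch is given below, using a standard Gillespie-type scheme. It assumes, for simplicity, that the rates depend on the configuration but not explicitly on time; the function names `simulate_spins` and `mean_spin` and the constant-rate example are hypothetical illustrations and not part of the model specification.

```python
import random

def simulate_spins(sigma0, rate, T, seed=0):
    """Simulate N spins in {-1, +1} up to time T.

    rate(i, sigma) is the (assumed time-independent) rate at which spin i
    flips when the current configuration is sigma.
    """
    rng = random.Random(seed)
    sigma = list(sigma0)
    t = 0.0
    while True:
        rates = [rate(i, sigma) for i in range(len(sigma))]
        total = sum(rates)
        if total <= 0.0:
            break  # no spin can flip any more
        # waiting time to the next flip is exponential with the total rate
        t += rng.expovariate(total)
        if t > T:
            break
        # the flipping spin is chosen with probability proportional to its rate
        i = rng.choices(range(len(sigma)), weights=rates)[0]
        sigma[i] = -sigma[i]
    return sigma

def mean_spin(sigma):
    """Total mass of the empirical spin measure, i.e. the average spin."""
    return sum(sigma) / len(sigma)

# Example: all spins start at +1 and every player uses the constant rate 1.
N = 200
final = simulate_spins([1] * N, lambda i, sigma: 1.0, T=3.0)
```

With a constant unit rate the spins evolve independently, so the average spin decays toward zero; a rate that depends on `sigma` would couple the players through the empirical measure.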
Global control problem. We define the global cost to be and the control problem is inf This has the form of either a standard optimal control problem with states in $\mathcal{P}(S^N)$ or a continuous-time Markov decision process with discrete states in $S^N$. We let $V^N : [0,T] \times S^N \to \mathbb{R}$ be the value function that solves where $h$ is the Legendre transform of $a \mapsto l(a)$ and the discrete finite gradient is . By standard theory, the optimal control is then given by

N-player game. We define the individual costs to be We consider the differential game played by the $N$ players. A Nash equilibrium is a collection of controls $A$ such that for each $i$ we have We look for coupled solutions $v^{N,i}_t$ with

2.2. Mean-field spin games. Since the dependence of each player's costs on the other players is only in terms of the empirical spin measure, one expects the system to limit to a mean-field game as $N \to \infty$. Specifically, the random empirical spin measures, $m_{\sigma,x}$, concentrate on a flow of deterministic spin fields $s(t,x)$ corresponding to the mean of $\sigma^i$ for $x^i$ near $x$. We follow this concept in order to, non-rigorously, derive the corresponding mean-field game system in the infinite-player limit.
For the mean-field version, we consider control policies $a_\pm(t,x)$. We work in terms of the spin field $s(t,x)$, which represents the average state of the players near $x$ at time $t$. We note that the density of players in state $\pm 1$ can be recovered as $\frac{1 \pm s(t,x)}{2}$. The evolution of the spin field is given by We assume that $f(\sigma, x, m) = -f(-\sigma, x, m)$, and when $s(x)\,dx = m(dx)$ we write $f(x, s) = f(+1, x, m)$. When $m_{\sigma,x}$ concentrates at $s(t,x)$ under the probability measure $\mu_t(\sigma)$, we have and we do the same for $g$ and $\bar{g}$. We also abuse notation slightly, to write $\bar{f}(s) = \bar{f}(m)$ when $s(x)\,dx = m(dx)$. The state space of spin fields $s(t, \cdot)$ is denoted by $X$, which is the unit ball in $L^\infty(\mathbb{T}^d)$.
The global cost is given by

Mean-field global control. The global optimal control problem is inf The value function is defined The McKean-Vlasov equation (2.2) can be expressed as, for all

Survey of results
We now list a few standard results, which are common in either finite-state mean-field games or continuous mean-field games [10], [26]. The proofs can all be adapted to spin games.

Potential games. A potential game occurs when the costs $f(x, s)$ and $g(x, s)$ are derived from potential costs as $f(x, s) = D\bar{f}(s)(x)$ and $g(x, s) = D\bar{g}(s)(x)$. In this case, the Nash equilibria for the mean-field game correspond to critical points of the global control problem.
Proof. Equation (2.5) is obtained by differentiating (2.4) with respect to the field argument, $s$. In particular, we have $Dv(x, s)(y)\,dy$.
Mean-field Nash System. In either the game or global case we have the mean-field Nash system, which is with $s(0, x) = s_0(x)$ and $p(T, x) = -g(x, s(T, \cdot))$.
Monotonicity. If $f$ and $g$ are monotone, i.e., then the solution to (2.6) is unique. We will be interested in the phenomena that arise without this property.
Proof. The proof follows exactly the same idea as the monotonicity argument for continuous games as in, for example, Proposition 3.2 of [9], by showing that the quantity decreases along the flow, and is nonnegative at the end-time due to the monotonicity of $g$.
Convergence of N-player games. Solutions of the $N$-player game / global control problem converge to the solutions of the mean-field game / global control problem when the interactions are in a mean-field form. Without the uniqueness of solutions, it is often necessary to consider a weaker randomized notion of solutions. On the other hand, the solution to the master equation (2.5) constructs approximate solutions to the $N$-player problem. Results on the convergence of continuous games can be found in [9] and [34]. The convergence problem for finite state games has been analyzed in [12] and [26]. The master equation has also been used to determine the fluctuations about the mean and a large deviations principle [12], [18]. We expect the results for the framework of spin games to follow the same pattern as in these works, although we do not pursue them in detail here.
A remarkable aspect of these results is that the resulting mean-field game system (2.6) does not depend on the information structure of the players, or even whether the problem originated as a game or as a global distributed optimal control problem. We focus on this system as the starting point for our macroscopic convergence analysis.

Ising game and Macroscopic limit
We now specify a problem formulation for which we consider in depth the question of a macroscopic scaling limit. The problem, modeled after the statistical Ising model, exhibits a phase transition, where in the 'ordered' phase two stable stationary equilibrium solutions are present. In the macroscopic limit, all equilibria will concentrate on these two solutions except on a codimension-one interface in space-time. We consider only the case of a potential game, in which case the global optimizers correspond to Nash equilibria. In contrast with the Ising model, the interface is 'controlled', to minimize an inhomogeneous space-time surface area, which can also be viewed as a minimization of the speed of propagation along the front. See the discussion at the end of Section 3.4.
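To make the appearance of two ordered-phase equilibria concrete, the following sketch works with a spatially homogeneous toy reduction that is an assumption of this example and not the system analyzed in the article. It assumes the entropic flipping cost $l(a) = a\log a - a + 1$ weighted by $\beta^{-1}$, an alignment cost $-\hat{J}\sigma s$ for a player in state $\sigma$, and writes $u$ for the difference of the two players' values. Under these assumptions the optimal rates are $e^{\pm\beta u}$, and the stationarity conditions are $(1-s)e^{-\beta u} - (1+s)e^{\beta u} = 0$ and $\tfrac{2}{\beta}\sinh(\beta u) + 2\hat{J}s = 0$, which admit nonzero solutions $s = \pm\sqrt{1 - (\beta\hat{J})^{-2}}$ exactly when $\beta\hat{J} > 1$.

```python
import math

def equilibrium_spin(beta, J_hat):
    """Positive stationary mean spin of the assumed toy model (0.0 if beta*J_hat <= 1)."""
    k = beta * J_hat
    if k <= 1.0:
        return 0.0
    return math.sqrt(1.0 - 1.0 / (k * k))

def stationary_residuals(s, beta, J_hat):
    """Residuals of the two toy stationarity conditions at beta*u = -atanh(s)."""
    w = -math.atanh(s)  # w = beta * u, chosen so that the spin equation is stationary
    spin_res = (1.0 - s) * math.exp(-w) - (1.0 + s) * math.exp(w)
    costate_res = (2.0 / beta) * math.sinh(w) + 2.0 * J_hat * s
    return spin_res, costate_res

# Subcritical and supercritical examples (parameter values chosen for illustration).
s_sub = equilibrium_spin(1.0, 0.5)
s_sup = equilibrium_spin(2.0, 1.0)
res = stationary_residuals(s_sup, 2.0, 1.0)
```

In this toy setting the rest state $s = 0$ is always stationary, and the pair of nonzero equilibria bifurcates from it as $\beta\hat{J}$ crosses $1$, mirroring the threshold stated for the Ising game.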
We work on the 'mesoscopic' domain $[0, \lambda^{-1}T] \times \lambda^{-1}\mathbb{T}^d$, where $\lambda^{-1}\mathbb{T}^d$ is the $d$-dimensional torus of width $\lambda^{-1}$ that can be associated with $[0, \lambda^{-1}]^d \subset \mathbb{R}^d$. We recall that in the discussion in Section 2 we have already passed from a 'microscopic' scale, which appears in the mesoscopic scale as a length scale of order $\lambda^{-1}N^{-1/d}$ (so we are effectively considering that $N \gg \lambda^{-d} \gg 1$).
The interaction will be determined by a kernel satisfying the following assumptions: (A1) $J : \mathbb{R}^d \to \mathbb{R}$ is non-negative and has finite total mass The interaction acts at a distance of order one on the mesoscopic scale where we use $(t, x)$, which will appear as a distance of order $\lambda$ on the macroscopic scale where we use $(\tau, z) = (\lambda t, \lambda x)$. We consider the convolution on a torus, for where we have used the periodic extension of $\eta$ to $\mathbb{R}^d$.
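Convolving a kernel on $\mathbb{R}^d$ with the periodic extension of a function on the torus is equivalent to convolving on the torus with the periodized ("folded") kernel. The following one-dimensional sketch shows one way to discretize this; the Gaussian kernel, the grid, the truncation parameter `n_images`, and the function names are assumptions made for illustration.

```python
import math

def gaussian(x):
    """Hypothetical kernel with total mass 1."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def periodic_convolution(kernel, eta, width, n_images=5):
    """Approximate (J * eta)(x_i) on a uniform grid x_j = j * h of a 1-D torus.

    eta holds samples of a function on the torus of the given width. Its periodic
    extension is accounted for by folding the kernel,
    J_per(y) = sum_k J(y + k * width), truncated to |k| <= n_images.
    """
    n = len(eta)
    h = width / n
    folded = [
        sum(kernel(j * h + k * width) for k in range(-n_images, n_images + 1))
        for j in range(n)
    ]
    # Riemann sum of J_per(x_i - x_j) * eta(x_j) over one period
    return [
        h * sum(folded[(i - j) % n] * eta[j] for j in range(n))
        for i in range(n)
    ]

# A constant field is mapped to the constant J_hat (here 1) times that field.
conv_const = periodic_convolution(gaussian, [1.0] * 64, width=8.0)
```

The constant-field check reflects the fact that the convolution of a constant $c$ equals $\hat{J}c$, which is the mechanism by which constant spin fields interact only through the total mass of $J$.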
3.1. Problem statement and macroscopic scaling. We rescale the cost by subtracting the cost of the stationary equilibria, $\Lambda$, and multiplying by $\lambda^d$ to capture the costs on a co-dimension one region. We consider the asymptotics as $\lambda \to 0^+$ of the problem to minimize where we define, inspired from the microscopic problem with l(a The Lagrangian incentivizes the neutral strategy, $a_\pm = 1$, where the spin switching rate is always $1$. With this strategy, the mean spin would lie at rest at $s = 0$. The parameter $\beta$ has the same effect as the inverse temperature in the Ising model, although the interpretation here is a control penalty and not inherently statistical. In the same fashion as the Ising model, the interaction incentivizes agreeing with nearby spins when $\hat{J} > 0$.
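The resting behavior under the neutral strategy can be checked with a short numerical sketch. It assumes the two-state balance law $\partial_t s = (1 - s)a_- - (1 + s)a_+$, which follows from the description of $a_+$ and $a_-$ as the flipping rates out of the states $+1$ and $-1$ with densities $(1 \pm s)/2$; the displayed evolution equation is not reproduced in this text, so this form, along with the function names below, should be read as an assumption.

```python
import math

def spin_velocity(s, a_plus, a_minus):
    """Assumed balance law: mass (1-s)/2 flips up at rate a_minus, mass (1+s)/2
    flips down at rate a_plus, and each flip changes the spin by 2."""
    return (1.0 - s) * a_minus - (1.0 + s) * a_plus

def relax(s0, t_end, dt=1e-3, a_plus=1.0, a_minus=1.0):
    """Explicit Euler integration of the mean spin under constant flipping rates."""
    s = s0
    for _ in range(int(round(t_end / dt))):
        s += dt * spin_velocity(s, a_plus, a_minus)
    return s

# Neutral strategy a_plus = a_minus = 1: the equation reduces to ds/dt = -2 s.
s_end = relax(0.8, 2.0)
exact = 0.8 * math.exp(-2.0 * 2.0)
```

Under the neutral strategy the mean spin decays exponentially to zero from any initial value, so any nonzero stationary spin must be sustained by controls that deviate from $a_\pm = 1$.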
The constant $\Lambda$, which corresponds to the cost of the stationary equilibria, is given by

Remark 3.1. We assume that $g$ does not depend on the spin field for simplicity, although our techniques would allow such dependence. In particular, it is natural to allow $g$ to depend on $s(x)$ locally, which is slightly different from the terminal cost in Section 2, where it was assumed that $g$ was defined over the empirical measures. This local form meshes naturally with our macroscopic analysis and could be the limiting result of a slightly more complicated microscopic problem.
We introduce the costate $p(t, x)$, so that $a_\pm$ maximizes the Hamiltonian, where the maximum occurs at The optimality equations, equivalent to (2.6), are Because we work directly with the energy, which we will view as a function of $(s, \partial_t s)$, we will mostly not refer to (3.5) or to the costate $p$.
We note that the problem can also be posed in the macroscopic coordinates $(\tau, z) := (\lambda t, \lambda x)$.
In the new variables the cost can now be written as and with the 'short' range interaction The corresponding optimality equations are We remark that, in this form, the system can be easily simulated using forward-backward iteration: see Figure 2 for some results from simulations.
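The forward-backward iteration mentioned above alternates between solving the costate equation backward in time for a fixed spin trajectory and solving the spin equation forward in time for the resulting controls. The sketch below illustrates the idea only on a spatially homogeneous toy system, which is an assumption of this example: it uses the entropic cost $l(a) = a \log a - a + 1$, the value difference $u$ between the two spin states, and the ODE pair $\dot{u} = \tfrac{2}{\beta}\sinh(\beta u) + 2\hat{J}s$ with $u(T) = u_T$ and $\dot{s} = (1-s)e^{-\beta u} - (1+s)e^{\beta u}$ with $s(0) = s_0$. The damping factor and step counts are likewise illustrative choices, and this is not the scheme used to produce Figure 2.

```python
import math

def forward_backward(s0, u_T, beta, J_hat, T, n_steps=200, n_iter=200, damping=0.5):
    """Damped forward-backward sweeps for the assumed homogeneous toy system."""
    dt = T / n_steps
    s = [s0] * (n_steps + 1)
    u = [u_T] * (n_steps + 1)
    for _ in range(n_iter):
        # backward sweep: du/dt = (2/beta) sinh(beta u) + 2 J_hat s, with u(T) = u_T
        u_new = [u_T] * (n_steps + 1)
        for k in range(n_steps, 0, -1):
            du = (2.0 / beta) * math.sinh(beta * u_new[k]) + 2.0 * J_hat * s[k]
            u_new[k - 1] = u_new[k] - dt * du
        # forward sweep: ds/dt = (1 - s) e^{-beta u} - (1 + s) e^{beta u}, with s(0) = s0
        s_new = [s0] * (n_steps + 1)
        for k in range(n_steps):
            w = beta * u_new[k]
            ds = (1.0 - s_new[k]) * math.exp(-w) - (1.0 + s_new[k]) * math.exp(w)
            s_new[k + 1] = s_new[k] + dt * ds
        change = max(abs(a - b) for a, b in zip(s, s_new))
        # damped update of the spin trajectory to stabilize the fixed-point iteration
        s = [(1.0 - damping) * a + damping * b for a, b in zip(s, s_new)]
        u = u_new
        if change < 1e-10:
            break
    return s, u

# Subcritical illustration (beta * J_hat = 0.5) with zero terminal condition.
s_path, u_path = forward_backward(s0=0.5, u_T=0.0, beta=1.0, J_hat=0.5, T=1.0)
```

For the full space-dependent problem the same alternation would apply with the nonlocal interaction term evaluated on a spatial grid at each time step; convergence of the iteration is not guaranteed in general, which is why the update is damped here.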
In the following, we will primarily work in the macroscopic coordinates, and drop the hat from the notation for $s$, $a$, and $p$.

3.2. Alternate parameterization and appearance of a double-well potential. In order to pass to the macroscopic limit, it is helpful to decompose the cost functional (3.1) into terms that resemble more closely what has been studied in the literature. The interaction cost (3.3) will be split as a local term and a nonlocal gradient penalization. An identical nonlocal term has been studied in [1] alongside a double-well potential, and we use this work as a primary guide for our analysis. The local part of the interaction cost combines with the Lagrangian (3.2), and then further decomposes as a double-well potential and a local penalization of the time gradient that is similar to kinetic energy. The local terms closely relate to the gradient penalizations with double-well potentials that were studied in [37] and many other works, except that we only have the time gradient and no spatial derivatives. The Ising game can thus be seen as a mixture of the local and nonlocal phase transition models, which is local in the time component and nonlocal in the spatial component. This mixture introduces many new challenges. However, using the decomposition detailed below, many of the techniques of both the local and nonlocal theory can be adapted for our analysis.
We expand the interaction cost as The first term may now be combined with the local control cost. To put this into a more standard form, we express the local terms as a function of the spin field and velocity, (see Figure 3). These minimizers $\pm s$ correspond to the stationary equilibria of the Ising game. Recall that we have normalized by the stationary cost $\Lambda$ so that $W_\beta(s) = W_\beta(-s) = 0$. The local energy $W$ decomposes into the double-well potential and a convex 'Dirichlet'-like energy that is quadratic near $v = 0$ and grows superlinearly like $|v| \log |v|$ as $|v| \to \infty$. An additional term $\Phi$ appears, which is a total time derivative and can be integrated out of the cost and incorporated into the boundary conditions. Surprisingly, the total time derivative term encapsulates all of the time asymmetry of the problem.

Figure 3. Plots of $W_\beta$ for $\hat{J} = 1$ and for varying values of $\beta^{-1} \in \{0.66, 0.9, 1.1\}$ crossing the critical value at $\beta \hat{J} = 1$.
In particular if we define with respect to $0$, and has three critical points in $(-1, 1)$ at $\pm s$ and $0$ which are, respectively, nondegenerate local minima and a local maximum. In particular, we have the explicit coercivity with respect to the potential minima See Section 3.5 for the proof.
Remark 3.3. In view of the decomposition of Proposition 3.2 it is natural to consider the function space for the spin field $s$ to be This is sufficient to make sense of the initial condition and terminal cost in the sense of $L^1$ trace. Due to the slightly stronger than $L^1$ growth of the time derivative energy, it is straightforward to obtain the existence of minimizers in this space. Also, since the spin field exists in $(-1, 1)$ (which will soon be improved to $(-s, s)$) the functions are also $L^\infty$. Later, in Proposition 4.2, we find that asymptotically the energy also bounds a gradient in the spatial directions, making the natural space for the macroscopic fields $s$ that of bounded variation functions on $(0, T) \times \mathbb{T}^d$.
Based on the decomposition, we now introduce a handful of localized quantities. We first localize the energy by defining, for We also denote just the local terms in the energy as When comparing localized energies, we must consider the locality defect, as in [1], corresponding to the discrepancy in nonlocal terms. For $A, A' \subset \mathbb{T}^d$, When defining the macroscopic costs, it is useful to consider a cost where the nonlocal term is integrated over all of $\mathbb{R}^d$ (where if $s$ is defined on $\mathbb{T}^d$ it can be extended periodically). That is Clearly, we have We also consider time-integrated versions of the above quantities. We will take the convention of naming the time-integrated energies with calligraphic font. If $A \subset \mathbb{R}^{d+1}$ we let $A_\tau$ denote the time slices of $A$ and $\tau_1$ and $\tau_2$ be the lower and upper bounds in time. Then In terms of these definitions, we can reformulate the cost only in terms of the spin field $s$.
Corollary 3.4. Assuming that the controls $a_\pm$ are optimal given $\partial_t s(t, x)$, we can write

3.3. Asymptotic heuristics and preliminary results. We assume that $\beta \hat{J} > 1$, for which there are two stable long time equilibria, $\pm s$, as shown in Proposition 3.2. Corresponding to each equilibrium there are unique controls given by the constrained minimization procedure of Proposition 3.2 at zero velocity, $A_\pm(s, 0)$ and $A_\pm(-s, 0)$.
The stable equilibria correspond to the leading asymptotic term of the cost that is canceled by the Λ in the definition (3.1) of C λ .
These results are summarized by the following proposition.
Proposition 3.5. Assume that $J(x) \ge 0$ for all $x \in \mathbb{R}^d$ and $\beta \hat{J} > 1$. Then the constant solutions $(s, A_\pm(s, 0))$ and $(-s, A_\pm(-s, 0))$ are globally optimal in the sense that if $s(0, x) = s(\lambda^{-1}T, x) = s$ for all $x \in \lambda^{-1}\mathbb{T}^d$, then $C_\lambda(s, a_\pm) \ge C_\lambda(s, A_\pm(s, 0))$, and the same holds with $(s, A_\pm(s, 0))$ replaced by $(-s, A_\pm(-s, 0))$. Equivalently,

See Section 3.5 for the proof, which follows directly from Proposition 3.2. The spin fields may be restricted to take values in the interval $[-s, s]$. We assume that the initial and final data $s_0$ and $g$ respect this condition as well. This assumption is probably not truly required, since the solution should be approximately in the interval $[-s, s]$ outside of some initial/final layers.

Lemma 3.6. Assume that $J(x) \ge 0$ for all $x \in \mathbb{R}^d$ and $\beta \hat{J} > 1$. In addition, suppose that $s(0, z) \in [-s, s]$ for all $z \in \mathbb{T}^d$ and that |g(z)| ≤ 1 2β Φ (s). Then the cut-off function

We refer again to Section 3.5 for the proof of the above lemma.
3.4. Macroscopic Energy. Let s denote a macroscopic field defined in (0, T ) × T d , which takes values in the equilibria {−s, s} almost everywhere with a discontinuity along some d − 1 dimensional interface. In the next section, we will prove that for minimizers (s λ , a λ ± ) lim where V is the effective cost, which we will make precise shortly. This result follows from a more general result in the framework of Γ-convergence, and helps to characterize asymptotically the minimizers (s λ , a λ ± ) in the sense that, passing to a subsequence, lim The time scale (in the macroscopic scale) of convergence to the equilibrium is O(λ), and the boundary layer terms at the initial and final times correspond to solutions to 'infinite time horizon' problems where the spatial interactions are small. There is also a boundary layer coming from the deviation from the long-time equilibria, {−s, s}, within a distance O(λ) of an interface of d − 1 dimensions, corresponding to solutions of a 'traveling wave'-type cell problem. See Figure 1.
The energy V will be interfacial, i.e., V (s 0 , g, s) = +∞ unless s(τ, z) ∈ {±s} for almost every (τ, z). We denote by BV ((0, T ) × T d ; {±s}) the set of bounded variation functions that take values in {±s}. The initial and final values s(0, •) and s(T, •) can be understood in the sense of BV trace, and both take values in {±s} (see for example Theorem 5.6 of [20] and consider that the traces at time δ i > 0 converge in L 1 as δ i → + 0, implying that the limit trace will take values in {±s}). Let ∂ * {s = s} ∩ {0 < τ < T } denote the essential boundary of the positive phase region in (0, T ), i.e., the phase interface. On the interface, let ν(τ, z) denote the measure theoretic unit normal pointing from where s = −s to where s = s, i.e., ν = Ds/|Ds|, the Radon-Nikodym derivative.
The macroscopic cost V is defined as The cost term V init incorporates the initial condition s 0 . The initial and terminal boundary layers reduce to an optimization on the mixed scale that is microscopic in time and macroscopic in space.
We now proceed to define formally the macroscopic energy in the interior. In this section, we will further characterize the initial and end costs and characterize heuristically the interfacial cost.
Following the general proof of [1], we consider the localized unscaled energy functional, F 1 , defined in (3.13). We define the rescaling R (τ,z),r of s to be R (τ,z),r s(t, x) := s(τ + r t, z + r x). (3.17) Recall that we can decompose F λ as where G λ and N λ are defined in (3.9) and (3.11). Both F λ and G λ satisfy the following scaling identity:

Lemma 3.7. For every set A ⊂ R d , we set z + r A = {z + r x : x ∈ A}, and we have and for every set B ⊂ R d+1 , we set (τ, z) + r B = {(τ, z) + r (t, x) : (t, x) ∈ B}, and we have

Proof. We calculate directly Including the time variable, (3.19) follows with the additional factor of r from the time integral.
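The mechanism behind the scaling identities of Lemma 3.7 is the affine change of variables $(t', x') = (\tau + r t, z + r x)$. The sketch below records only this generic step, for an arbitrary integrand $f$ (a placeholder, not one of the densities of the paper); how the resulting powers of $r$ combine with $\lambda$ in $F_\lambda$ and $G_\lambda$ depends on the definitions (3.9)-(3.13).

```latex
% f is a generic placeholder integrand; B \subset \mathbb{R}^{d+1}, r > 0.
\begin{align*}
\int_{(\tau,z)+rB} f\big(s(t',x')\big)\,dt'\,dx'
  &= r^{d+1}\int_{B} f\big(s(\tau+rt,\,z+rx)\big)\,dt\,dx
   = r^{d+1}\int_{B} f\big(R_{(\tau,z),r}\,s(t,x)\big)\,dt\,dx,\\
\partial_t\big(R_{(\tau,z),r}\,s\big)(t,x)
  &= r\,(\partial_t s)(\tau+rt,\,z+rx).
\end{align*}
```

The Jacobian contributes $r^{d}$ from the spatial variables and one further factor of $r$ from the time variable, which matches the remark in the proof about the additional factor of $r$.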
We now use the unscaled energy F 1 to identify the form of the macroscopic costs by a "cell problem", namely with test functions that are periodic on the tangential plane of a normal direction ν. We follow nearly the same definitions as [1] for L(ν), but here we have a space-time normal ν, in contrast to the spatial setting in [1]. Another new feature is the addition of a width parameter R ∈ R + in the function class, which corresponds essentially to compactly supported variation from the function values {−s, s}. This compactness is helpful to restrict our arguments to near the interface, e.g., within a distance λ R that will become small as λ → + 0; it was not needed in [1] due to the simpler nature of patching in their problem. We then extend these definitions to also apply to the initial and end times, where we impose the additional initial condition as a constraint and the terminal cost in the energy.
For a unit-normal vector ν in R d+1 we define ν to be the set of all d-dimensional cubes centered at the origin and orthogonal to ν. For ∈ ν , we let × R ν denote the strip × R ν := {(t, x) + ξ ν; (t, x) ∈ , ξ ∈ R}. We say that s : R d+1 → R is -periodic if s((t, x) + r ω) = s(t, x) when r ∈ R is the sidelength of and ω ∈ R d+1 is a unit-normal vector along an axis of . Finally, with R > 0, we introduce the function class Now we define the interfacial energy with normal ν with width R to be LR (ν The assumption that s is C 1 is significant here, as discontinuities in the time direction across the cell boundary could result in extra, unaccounted for, energy. We will show later in Lemma 4.12 that the condition s ∈ C 1 can be replaced by a finite energy condition without changing the value of LR . With the above definitions, we finally define the interfacial energy to be L(ν) := lim inf R→+∞ LR (ν). (3.23) The limit exists since LR (ν) is monotone decreasing in R and nonnegative by Proposition 3.5.
Note that, although the definition above makes sense for any s 0 ∈ (−1, 1), in what follows we will only consider the case s 0 ∈ [−s, s], which is easier.
We then define, for g ∈ R (which will later be restricted to |g| ≤ 1/(2β) Φ′(s)), and Note that the limits in (3.27) and (3.31) exist due to monotonicity, although one can easily see that they are +∞ unless s ∈ {−s, s}.
For the remainder of this subsection, we discuss a further characterization of the macroscopic energy terms L(ν), V init (s 0 , s), and V end (s, g). Heuristically, when J is radial and monotonically decreasing in the radial direction, the simplest form of a solution is given by the one-dimensional traveling wave, namely We prove that this is indeed the case for V init and V end , where the nonlocal term does not participate. It remains a conjecture for L.
Theorem 3.8. The macroscopic initial energy is given by the one-dimensional reduction Similarly, the macroscopic terminal energy is given by

Proof. The inequality ≤ for both (3.32) and (3.33) is immediate, as the one-dimensional solutions may be used in the definition of V init (s 0 , s) and V end (s, g) by extending them as constants in space, which incurs the same cost.
For the other direction, we consider s ∈ X init R (s 0 , s, ). We may find a regular value for x such that q(t) = s(t, x) satisfies q(0) = s 0 and The inequality ≥ in (3.32) follows as the nonlocal term is nonnegative. Similarly, for (3.33) we consider s ∈ X end R (s, ), and find a regular value of x such that q(t) = s(t, x) and The result follows.
Remark 3.9. The decomposed energy has a symmetry, s(t) = −s(−t), by evenness of W β and Ψ. This interesting observation, not obvious from the original formulation, yields in particular that So long as −s < s 0 < s, the solution for V init (s 0 , s) is a time translation of the same 'heteroclinic' solution. When s 0 ≤ −s and s = s, the solution for s in the definition of V init , (3.32), does not exist, although the infimum is still well defined.
Remark 3.10 (Controlled front propagation). We may also relate the macroscopic problem to a problem of optimal control of a propagating front, which has been studied in [7], [8]. Write the unit normal as ν = (ν t , ν x ); when |ν x | ≠ 0 the front speed may be expressed as c = ν t /|ν x |. We let ν̂ x = ν x /|ν x | denote the spatial unit normal. The anisotropic minimal surface problem for Σ, may now be converted into a problem of controlled front propagation where the cost rate to propagate the front with velocity c and spatial unit normal ν̂ x is given by where clearly ν can be recovered from c and ν̂ x as ν = (c, ν̂ x )/√(1 + c 2 ). By an application of Fubini's theorem and the coarea formula, we may express Thus the macroscopic problem is reinterpreted as controlling the wave speed of the evolving front Σ t . This formulation recovers some of the optimal control structure of the problem. A more rigorous expression of the controlled front problem is given in [8].
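The recovery of $\nu$ from the front speed can be checked in two lines. The computation below is a sketch that uses only $|\nu| = 1$ and $|\nu_x| \neq 0$, writing $c = \nu_t/|\nu_x|$ for the front speed and $\hat\nu_x = \nu_x/|\nu_x|$ for the spatial unit normal.

```latex
\begin{align*}
1 + c^2 &= \frac{|\nu_x|^2 + \nu_t^2}{|\nu_x|^2} = \frac{1}{|\nu_x|^2}
  \quad\Longrightarrow\quad |\nu_x| = \frac{1}{\sqrt{1+c^2}},\\
\frac{(c,\hat\nu_x)}{\sqrt{1+c^2}}
  &= |\nu_x|\left(\frac{\nu_t}{|\nu_x|},\,\frac{\nu_x}{|\nu_x|}\right)
   = (\nu_t,\nu_x) = \nu .
\end{align*}
```

In particular $c \to \pm\infty$ corresponds to $|\nu_x| \to 0$, that is, to a normal pointing purely in the time direction.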
A partial result holds for the interfacial energy, reducing the problem to the directions (t, ω νx ) when |ν x | ≠ 0. For a unit-normal e, we let where the integral is taken over the subspace orthogonal to e. Given ξ ∈ R d and a unit-vector e ∈ R d we set ξ ⊥e = ξ − (ξ • e)e.

Proposition 3.11. Given a unit vector ν with |ν x | ≠ 0, assume that the Fourier transform FJ(ξ) is maximized at FJ(ξ ⊥νx ).
Then the macroscopic interfacial energy L(ν) is given by the two-dimensional reduction where we limit the dependence of functions in X R ( ) to only (t, x • νx ).
The assumption on FJ is satisfied for instance when J is a Gaussian centered at zero.
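To see why a centered Gaussian qualifies, one can read the hypothesis as the pointwise inequality $\mathcal{F}J(\xi) \le \mathcal{F}J(\xi^{\perp e})$ for a unit vector $e$. Both this reading and the normalization of the Fourier transform used below are assumptions of this sketch; the parameter $\sigma$ is introduced only for illustration.

```latex
% Assumed normalization: \mathcal{F}J(\xi) = \int_{\mathbb{R}^d} J(x)\,e^{-i\xi\cdot x}\,dx,
% with J(x) = (2\pi\sigma^2)^{-d/2}\, e^{-|x|^2/(2\sigma^2)}, \ \sigma > 0.
\begin{align*}
\mathcal{F}J(\xi) &= e^{-\sigma^2|\xi|^2/2},\\
|\xi^{\perp e}|^2 &= |\xi|^2 - (\xi\cdot e)^2 \;\le\; |\xi|^2,\\
\mathcal{F}J(\xi) &\le e^{-\sigma^2|\xi^{\perp e}|^2/2} = \mathcal{F}J(\xi^{\perp e}).
\end{align*}
```

The same argument applies to any kernel whose Fourier transform is a nonincreasing function of $|\xi|$.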
Proof. We extend s to R d+1 by zero, and the Plancherel/Parseval theorem and the convolution formula state that by our assumption on FJ.
The inverse Fourier transform of FJ(ξ ⊥νx ) in all variables is formally δ ⊥νx J(x), where δ ⊥νx is the (d − 1)-dimensional Hausdorff measure on the subspace orthogonal to νx , and Equality above holds when s does not depend on x ⊥νx . The problem for L(ν) is then equivalent to minimizing over s that depend only on t and ω = x • νx .
Conjecture. We suspect that, under the assumptions of Proposition 3.11, the limit cost can be characterized entirely in terms of the travelling wave solutions. The potential lack of regularity and the topology of the interface associated with the limiting cost make it difficult to verify our ansatz. More precisely, we conjecture that the interfacial cost L can be characterized using the front speed c = ν t /|ν x | (the ratio of the sizes of the normal in the time and spatial directions). Indeed we expect that (recall A minimizer of (3.34) would then yield travelling wave solutions of the form s(t, x) = q(c t − νx • x).

Remark 3.12. If the conjecture holds, then it follows that as the infinite speed transition is equivalent to the microscopic in time switching from −s to s.

3.5. Proofs. Here we present some of the longer proofs from earlier in this section that we postponed: the decomposition formula (Proposition 3.2), optimality of the constant solutions (Proposition 3.5), and the improvement of cost for states bounded between the equilibria (Lemma 3.6).
3.5.1. Proof of Proposition 3.2. First we compute the optimal controls, A ± , as a function of V . Changing A ± while preserving the equality only affects the term L(S, A ± ) in the energy, so by Lagrange multipliers there is Plugging this back into the constraint ODE we find the quadratic equation so taking the positive root of this equation and then using the constraint These are strictly positive, monotone, and convex in V . Note that and, in particular, Differentiating (3.35) again we note that Now plugging this into the definition of W (S, V ), we see that the desired properties of W are a matter of calculus. First, compute and, in particular, For the second derivative we continue computing Using (3.36) and log(A + A − ) = 1 we see that the second term above vanishes and so where we have used (3.35) to get the second equality. This gives the convexity in the V variable, and we can go a bit further to make a strict convexity estimate.
Then by the fundamental theorem of calculus We have the formula ∂ V W (S, 0) = 1/(2β) log((1 + S)/(1 − S)) from above, which we express as ∂ V W (S, 0) = 1/(2β) Φ′(S). For the remainder term we define, as in the statement of the proposition, dZ.
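Reading the identity above as $\partial_V W(S,0) = \frac{1}{2\beta}\Phi'(S)$ with $\Phi'(S) = \log\frac{1+S}{1-S}$, the function $\Phi$ is determined up to an additive constant. The short computation below is a sketch under that reading; the constant $C$ is not fixed here, and the normalization of $\Phi$ in the statement of Proposition 3.2 may differ.

```latex
\begin{align*}
\Phi'(S) &= \log\frac{1+S}{1-S} = 2\,\operatorname{artanh}(S), \qquad S\in(-1,1),\\
\Phi(S) &= (1+S)\log(1+S) + (1-S)\log(1-S) + C,\\
\Phi''(S) &= \frac{1}{1+S}+\frac{1}{1-S} = \frac{2}{1-S^2} \;>\; 0 .
\end{align*}
```

Under this reading $\Phi$ is convex on $(-1,1)$ and $\Phi'$ is odd and increasing, which is the convexity invoked at the end of the proof of Lemma 3.6.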
We note that, for 0 Note that the upper bound is true for arbitrary V .While for V ≥ (1 − S)(1 + S) The corresponding upper bound will not be used anywhere so we omit the proof.
Next, we discuss the properties of ∂ V Ψ(S, V ). Note that which implies the convexity of Ψ and the concavity of double-well potential. Lastly, we consider the double/single well nature of the potential Note that W β always has a critical point at S = 0 and

Proof. The proposition also follows from Proposition 3.2, as we have Ψ(S, V ) ≥ 0 with equality when V = 0, 2 dw dz ≥ 0 with equality when s is constant, and W β (S) ≥ 0 with equality when S ∈ {−s, s}. From the proof of Proposition 3.2 we see that the case S = s corresponds exactly with A = A ± (s, 0), and the case S = −s corresponds exactly with A = A ± (−s, 0).
We first note that for each z ∈ T d , s b+ and s b− are weakly differentiable in time with The local part of the cost separates into the two domains, that is, For the nonlocal part, we will use that for (τ, z) and s b− (τ, z) ≤ s(τ, z), and s b+ (τ, z) ≥ s(τ, z). By nonnegativity of J λ we also have that For (τ, z) ∈ Ω, we have s(τ, z) = s b+ (τ, z) and s b− (τ, z) = s and (dropping the dependence on (τ, z) for ease of notation) Similarly, for (τ, z) ∈ Ω c , we have s(τ, z) = s b− (τ, z) and s b+ (τ, z) = s and

This implies that
and, by Proposition 3.5, We conclude Observing that Φ is convex and |g(z)| ≤ 1/(2β) Φ′(s), we have which finishes the proof.

Main result
Our main result is a type of Γ-convergence, akin to Theorem 1.4 of [1], which addresses the nonlocal Allen-Cahn equation. In addition to the previous Assumptions (A1) and (A2) on J, we will always assume in this section (A3) (Super-criticality) β Ĵ > 1. Under this assumption W β is a double-well potential with two distinct minimizers ±s. In the critical or subcritical case the asymptotic behavior will be completely different.
In Section 4.1 we show that, in an appropriate sense, the cost C λ asymptotically controls the BV norm, and so sequences s λ with bounded cost C λ satisfy appropriate compactness properties. The argument combines ideas for local and nonlocal Allen-Cahn problems in a slightly delicate, but largely standard, way. Note that at this stage we have yet to characterize the macroscopic cost V .
In Section 4.2, we prove several technical "patching" results which are key to the later Γ-convergence arguments.These are quantitative versions of localization results that are naturally needed to ensure that our macroscopic Lagrangian depends locally only on the normal directions of the interface between the state −s and s.More precisely, we show that sequences of test minimizers defined in disjoint domains can be patched along a joint boundary without increasing the energy too much as long as an appropriate notion of trace matches along this joint boundary.The ideas in this section are inspired by [1], but the argument is technically more difficult because the cost functional requires some microscopic regularity in the temporal direction.
Then in Section 4.3 and Section 4.4 we carry out the typical two-part Γ-convergence argument.
The argument for the lower bound inequality in Section 4.3 follows a classical general technique introduced by Fonseca and Müller [23]: the problem can be reduced to establishing a pointwise lower bound on the densities for a subsequential limit of the particular test minimizer sequence s λ .In technical terms the patching and compactness lemmas play a key role here.
For the upper bound inequality in Section 4.4 we follow a beautiful idea introduced by Alberti and Bellettini [1] of induction on polyhedral regions. By approximating with polyhedral regions instead of smooth sets, Alberti and Bellettini reduced the entire difficulty of controlling lower order terms related to the "bending" hyperplanes to a relatively simple patching argument where polyhedral test regions meet transversally to the interface. This argument adapts nicely to our setting because we have also established a technique for patching local test minimizers.

4.1. Compactness. In this section we show that sequences with bounded G λ are precompact in L 1 and all cluster points are indicator functions of sets of bounded variation. As we have explained, the energy G λ is understood to measure the space-time surface area of the interface between the ±s phases. Thus a BV -like compactness result is to be expected. We note that the estimates we obtain are not uniform as β Ĵ → + 1, i.e., s → + 0, reflecting the possibility of some more complex phenomena occurring near the critical parameter values.
Of course this type of result is well known for Allen-Cahn [37] and nonlocal Allen-Cahn functionals [1].Our functional is a mix of the two, and with some technical tricks inspired by the two cases we can prove the compactness.
Our first step is to decompose the energy into a typical local Allen-Cahn type functional measuring the temporal variations, and a nonlocal Allen-Cahn functional measuring the spatial variations: The space-time energy splits analogously where the calligraphic X λ and Y λ are naturally defined as the temporal integrals of X λ and Y λ , as was done for G λ in (3.13).
The proof is a combination of the compactness arguments for local and nonlocal Allen-Cahn.
First we estimate the time derivative using the bound on Y λ and following a standard argument for local Allen-Cahn functionals. By Young's inequality, Using the above with (3.7) for the set |λ∂ τ ŝ| ≤ 1 and using the Ψ bound with (3.7) for the rest, we arrive at ds so that we have proved (4.1) Next, we use the bound on X λ to obtain a uniform bound for the spatial gradient of (a mollification of) where ϕ is the cut-off function The point of this cut-off is that (i) it is Lipschitz, so it doesn't affect the temporal energy Y λ too much, and (ii) it simplifies the computation of the nonlocal part of the energy, essentially concentrating the energy on the interface without changing the L 1 limit of the sequence (an idea of [1]).
The proof of the bound on |∇ z (φ λ * sλ )| follows closely that of Theorem 3.1 in [1]. First, the inequality is established by direct computation with a change of variables argument. The right hand side is decomposed using the set using the fact that sλ ∈ {±s} away from H λ τ . In H λ τ , by contrast, we simply bound |s λ (τ, z) − sλ (τ, w)| ≤ 2 s. The area of H λ τ can be bounded by a constant times the integral of W β (ŝ), since there is ρ > 0 such that W β (s) ≥ ρ when s ∈ [−s/2, s/2]. Along with nonnegativity of the nonlocal term, we have We use this to estimate the error in mollification where, for the final inequality, we used the estimates from the previous paragraph. Thus we obtained (4.4) By similar computations Let us now put together the above bounds to obtain the global bound. Since W is invertible on [−s/2, s/2], by definition (4.2) it follows that ϕ • W −1 is Lipschitz. So using the previous inequality and (4.1) it follows that By standard BV compactness results there is a subsequence so that φ λ * s → s strongly in L 1 and s ∈ BV ((0, T ) × T d ) with Finally, we show the convergence of ŝλ . Let As in (4.3) we have and, from (4.4) and, since |ŝ − s| = 0 outside of K λ and |ŝ − s| ≤ 2 otherwise, Since δ is arbitrary, the sequence ŝλ is equivalent to φ λ * s in L 1 and thus is relatively compact with all of its cluster points in BV ((0, T ) × T d ; {−s, s}).

Patching lemma.
In this section we develop a technical tool which will be essential in the proof of Γ-convergence. Roughly speaking, we look for a way to "patch" test minimizers which are defined in disjoint domains to be a test minimizer in the union without increasing the total energy too much. To this end we will need to control the increase of the nonlocal "Dirichlet" energy and the increase of the local "Dirichlet" energy for the cost in the form (3.14). For the nonlocal part of the energy we will follow the ideas in Section 2 of [1]. However, in [1] the actual patching can be done in a straightforward way, simply defining a new, possibly discontinuous, test minimizer piecewise. We cannot do this because the local energy penalizes large time derivatives by the term λ −1 Ψ(u, λ∂ t u) in the energy. Thus the presence of the local energy necessitates a smoother notion of patching. We introduce a notion of "trace" on d − 1-dimensional surfaces, imitating the notions introduced in [1]. Define an auxiliary potential Because most of the notions in this section are local, we will often work with test spin fields on (τ, z) in subsets of R × R d .
where A τ = {z : (τ, z) ∈ A}. Note that finite energy fields do not necessarily have a true trace on space-like d-dimensional surfaces: the nonlocal energy only gives large-scale, not micro-scale, regularity. The notion of trace error circumvents this technical difficulty.
This leads to a notion of convergence of traces imitating [1, Definition 2.1]. Note that we need to add some additional terms to our notion of trace convergence to deal with the temporal part of the energy. First we define the distance to a space-time surface Σ in the pure temporal variable t((τ, z), Σ) = inf{|σ| : (τ + σ, z) ∈ Σ}.
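As a quick illustration of the temporal distance, consider a hyperplane through the origin with unit normal $\nu = (\nu_t, \nu_x)$ and $\nu_t \neq 0$. This is a hypothetical example chosen for concreteness, not a case singled out in the text.

```latex
% \Sigma = \{(\tau,z) : \nu_t\,\tau + \nu_x\cdot z = 0\}, \quad |\nu| = 1,\ \nu_t \neq 0.
\begin{align*}
(\tau+\sigma,\, z)\in\Sigma
  &\iff \sigma = -\,\frac{\nu_t\,\tau + \nu_x\cdot z}{\nu_t},\\
t\big((\tau,z),\Sigma\big)
  &= \frac{|\nu_t\,\tau + \nu_x\cdot z|}{|\nu_t|}
   = \frac{\operatorname{dist}\big((\tau,z),\Sigma\big)}{|\nu_t|}.
\end{align*}
```

The temporal distance therefore exceeds the Euclidean distance by the factor $1/|\nu_t|$ and degenerates as $\nu_t \to 0$, in line with traces being available only $|\nu_t|\mathcal{H}^d$-almost everywhere on $\Sigma$.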
Definition 4.4. We say that the λ-traces on Σ of a sequence where we note that the trace of u λ on Σ is defined |ν t |H d -almost everywhere on Σ due to the superlinear energy bound on ∂ τ u. Recall from Proposition 3.2 that Ψ 0 (V ) = Ψ(0, V ).
Remark 4.5. Note that the energy bound sup λ G λ (u λ ; A) < +∞ morally is a uniform BV -norm bound and is not enough to show that the traces of u λ on a hypersurface Σ lie in a strongly compact subset of L 1 (Σ, H d ). Also, as mentioned earlier, the energy bound is not sufficient regularity to give a notion of trace for u λ on parts of Σ where |ν t | = 0. Thus, as remarked in [1], the convergence of λ-traces is not so easy to verify for a specific surface. On the other hand, if we take a foliation by Lipschitz hypersurfaces, we can get convergence of the λ-traces on almost every surface in the foliation, as we show in Lemma 4.6.

Lemma 4.6. Suppose that A, Σ, and u λ are as in Definition 4.4 and Let g : A → R be a Lipschitz function with |∇ τ,z g| = 1 a.e. and Σ a be the a-level set of g. Suppose that u λ → u 0 in L 1 (A). Then, along a subsequence, the λ-traces of u λ on Σ a relative to A converge to u 0 for a.e. a ∈ R.
Proof. This proof is a slight generalization of [1, Proposition 2.5]. We may suppose that u λ and u 0 are defined on R d+1 , extended by 0 away from A. Define (4.6) Each term on the right converges to zero as λ → 0. Note that ‖u 0 (• + λh) − u 0 ‖ L 1 (A) → 0 for each fixed h and the integrand is dominated by 2‖u 0 ‖ L 1 (A) J(h). Since Q λ (a) ≥ 0, it converges to zero in L 1 and so, up to a subsequence, it converges to zero pointwise for a.e. a ∈ R.
We will also use another criterion for trace convergence, modified from [1]: Now we move forward to prove a bound on the patching error in terms of the tracial quantities we have defined. First, we recall a definition from [1] which was used for the nonlocal energy control.

Definition 4.8. We say Σ strongly divides A − and A + if Σ is the Lipschitz boundary of some set Ω with We recall a localization lemma of [1].

Lemma 4.9. Suppose that A ± are disjoint subsets of R d+1 and are strongly divided by Σ. Suppose further that u λ : Then the discrepancy cost N λ defined in (3.13) satisfies lim sup

Proof. In [1], this is proved for each time-slice, and the result is obtained simply by integrating in time and using the coarea formula.
The next result shows how to patch test minimizers across a regular (Lipschitz) boundary. As in [1], patching creates extra nonlocal energy due to the nonlocal defect. However, now we also have a local term in the energy, λ −1 Ψ(u, λ ∂ τ u), which grows superlinearly in the ∂ τ u variable. This means that we cannot simply patch discontinuously; we need to make a regularization at the λ length scale across the patching boundary. The next proposition shows that such a regularization can be made, at an additional energy cost which is controlled by a trace error of the type introduced in Definition 4.4. This result addresses the temporal Dirichlet type energy, which is not present in [1].
Then there is ũ : (A ∪ Σ) o → [−s, s] such that ũ = u outside of a λ neighborhood of Σ and for every δ > 0 and for any subregion B ⊂ A Here G λ loc is defined in (3.13) and the constants C depend on Σ.
Proof. We will first assume that Σ is a subset of a single affine d-plane; at the end of the proof we will explain how to extend to the general case of a finite union of affine pieces.
If z is not in the projection of Σ onto T d then we define r * (τ, z) := ∓∞ in A ± .So we have ±(τ − r * (τ, z)) ∈ [0, +∞] in A ± respectively.Also, in the typical style of a priori estimates, we can assume that u is C 1 individually in A ± (but not their union) so that the computations below are justified, but then the estimate obtained will not depend on the C 1 norm so we can remove that assumption in the end.Now we proceed in several steps.
Step 1. First we introduce ũ, which essentially averages u in a temporal neighborhood of Σ of size O(λ). This is the exact scale at which we must perform the regularization: smaller scales would have too large temporal "Dirichlet" energy and larger scales would magnify the energy of transitions from −s to s too much. The energy error of the regularization will be related to the trace difference which needs to be traversed over the λ scale.
Let ζ be a cut-off function that satisfies We make a few computations relating û − u and ∂ τ û to the traces on Σ. When ζ(τ, z) > 0 then r * (τ, z) ∈ [τ − λ/2, τ + λ/2]. We use this to write, on {ζ > 0}, Then we can use this decomposition in û as well. By definition we have where µ(τ, z) ∈ (0, 1) is defined as the fraction of σ ∈ [−λ/2, λ/2] so that (τ + σ, z) ∈ A + and The appearance of this type of error term motivates the following definition on Note that avg ± (|∂ τ u|) are integral averages of |∂ τ u| purely on A ± , respectively; they do not see the discontinuity across Σ. Recall that we have reduced, for convenience, to the case where Σ is a graph over the t direction and ±(τ − r * (τ, z)) > 0 on A ± . One particular consequence of these computations is that on {ζ > 0}, (4.9) We can also make a similar decomposition for ∂ τ û. Note

Step 2. Next we make a general comment on integrals of the type Note that, since Thus we obtain the change of variables formula

Step 3. Using the above formula, we will see that error terms of type avg ± (|∂ τ u|) can be controlled by the energy in a λ-temporal neighborhood of Σ.
The following formulae will be applied below with f = |∂ τ u| or other related functions in later steps. We use the change of variables formula, applied to h(τ, z) := f (τ + σ, z), twice to compute Similarly, Combining the above we find where we used (4.10) at the last step with h ≡ 1.
Step 4. In this step, we work to estimate the local terms in the energy, with the first focus on the "Dirichlet" type term. We aim to estimate from above the difference We define As part of estimating the previous energy difference we will estimate where we used, in order, from Proposition 3.2, that Ψ is monotone increasing, odd symmetric, subadditive on [0, ∞), and for the remaining inequalities we used the bounds (3.7) and (3.8).
Applying this we arrive at For the first term on the right we use non-negativity of Ψ The second term can be estimated by the discussion in Step 3, in particular (4.12). It remains to discuss the error term (I), relying on the convexity of Ψ 0 . Observe first that The second term is already one of the claimed error terms in the statement. The first term is bounded by using formula (4.9) to relate with the traces on Σ: where the last line uses that Ψ 0 (A + B) ≤ C(Ψ 0 (A) + Ψ 0 (B)). It follows, again by convexity of Ψ 0 and Jensen's inequality, that Ψ 0 (λavg ± (|∂ τ u|)) ≤ avg ± (Ψ 0 (λ|∂ τ u|)).
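The Jensen step can be spelled out as follows, writing the average as a normalized integral over a generic set $I$ of finite positive measure. The set $I$ is a placeholder for the averaging set in the definition of $\mathrm{avg}_\pm$, which is not reproduced here; the only property of $\Psi_0$ used is its convexity.

```latex
% I: generic averaging set with 0 < |I| < \infty; \ f = |\partial_\tau u| \ge 0.
\[
\Psi_0\!\left(\lambda\,\frac{1}{|I|}\int_I f\,d\sigma\right)
 \;\le\; \frac{1}{|I|}\int_I \Psi_0(\lambda f)\,d\sigma .
\]
```

The right-hand side is an average of the energy density $\Psi_0(\lambda|\partial_\tau u|)$, which is why this term can be absorbed into the energy in a temporal neighborhood of $\Sigma$.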
This type of term appears on the right hand side of the claimed estimate so we are done estimating term (I).
Next we deal with the double-well term So we need to deal with this term on the right Applying (4.9) and using that From there the estimate is the same as in Step 3.
Step 5. We still need to bound |N λ (ũ, B, B) − N λ (u, B, B)|. For that we write ũ = u + ζ(û − u) and bound This type of error term was already bounded in Step 4 above.
Step 6. Finally we consider the case when Σ is a finite union of pieces Σ 1 , . . ., Σ J which are each contained in affine planes P j , see Figure 6.We can assume that the planes P j are all distinct, otherwise, the corresponding sets could be regrouped with a smaller J.For λ > 0 sufficiently small the λ-neighborhoods of any two parallel planes of {P j } will be disjoint.Any two non-parallel P j meet, at most, on a set of Hausdorff dimension d − 1 and we can bound, using the compactness of the region [0, T ] × T d , With this in mind, we proceed inductively and assume we have constructed ũj satisfying the conclusion of the theorem on Then we apply the single plane case to define ũj+1 by the mollification of ũj .Notice that the traces (ũ j ) ± on Σ j+1 only differ from the traces of u on the intersections of Σ λ k for 1 ≤ k ≤ j with Σ j+1 and by the previous argument these intersections have H d measure bounded by Cλ so which we have already assumed, in the inductive hypothesis, to be bounded by the right-hand side of (4.7).
We now state several useful consequences of Proposition 4.10, restated in terms that are more directly useful for Section 4.4.
The primary use of Proposition 4.10 is to patch together two solutions that agree in trace along the dividing boundary.We state as a corollary that this can be done without introducing extra cost in the limit as λ → 0. We use the notation A 1 A 2 to denote the interior of the closure of A 1 ∪ A 2 .
Corollary 4.11. Suppose that a surface Σ, which is a finite union of pieces of d-dimensional affine planes, strongly divides a pair of sets A 1 and A 2 , and u λ 1 : Suppose further that the λ-traces on Σ of u λ 1 converge to v 1 : Σ → [−s, s] and the λ-traces on Σ of u λ 2 converge to v 2 : Σ → [−s, s]. Then there exists u λ : Furthermore, the construction of u depends only locally on the values of u 1 and u 2 , and u = u 1 or u = u 2 at a distance greater than λ from the boundary of A 1 or A 2 .
Proof. The proof is a direct application of Proposition 4.10 with the function A 2 and u λ = ũ. Definition 4.4 ensures that the right hand side of (4.7) is controlled in the limit as λ → + 0 by C ∫ Σ |v 1 − v 2 | dH d and also lim sup applying (4.7) with B respectively to be A 1 ∪ A 2 , A 1 , A 2 . Recall from Definition 4.4 and the assumption of λ-trace convergence on Σ that the last term on the right of (4.7) involving λ −1 Ψ 0 (λ•) also converges to zero as λ → 0. Furthermore, Lemma 4.9 implies that the nonlocality defect N λ (u, A 1 , A 2 ) is bounded by C ∫ Σ |v 1 − v 2 | dH d . The L 1 equivalence follows from the fact that u 1 = u and u 2 = u at a distance greater than λ from Σ.
We similarly will use an adjustment of functions defined on a square to periodic functions in X R . We fix a space-time unit-normal vector ν and ∈ ν , and consider the step function Recall that we define spaces for -periodic C 1 functions in Section 3 ((3.20), (3.24), and (3.28)). Recall also that we consider a sequence λ i > 0 such that λ i → 0 as i → ∞, which we denote as λ → + 0.
(a) Given u λ : × R ν → [−s, s] satisfying the following: sup λ G(u λ ; × R ν) < ∞, the λ traces agree with v ν on the ∂ × R and on × {−R, R}ν, then there is ũλ ∈ X R ( ) such that lim sup the λ-traces agree with v ν on the ∂ × R + and on × {R}ν, the L 1 -trace converges on × {0}ν to a constant s 0 ∈ (−s, s), then there is ũλ ∈ X init R (s 0 , ±s, ) such that lim sup the λ-traces agree with v ν on the ∂ × R + and on × {R}ν, then there is ũλ ∈ X end R (±s, ) such that lim sup s] and A 2 be the union of all the translations along one sidelength of .Then Corollary 4.11 constructs ũλ on A 1 A 2 with lim sup The constructed ũλ is periodic along the translations as the construction of Proposition 4.10 is local, making the adjustment of ũλ on one edge the same as the adjustment of ũλ on the opposite edge.In this way we may consider ũλ defined on all of R d+1 .We now proceed with a mollification of ũλ to ũλ at a scale much smaller than λ.The mollification converges in L 1 and the local time gradient term is lower semicontinuous due to convexity.Furthermore, the λ-trace error does not increase more than order λ, and thus the λ-traces of the mollified sequence converges.By the nonlocal defect estimate of Proposition 4.10, we have that the nonlocal defect vanishes across A 1 and R d+1 \ Ā1 and thus from (3.12) we have lim sup We proceed similarly at the initial and end times.We need to patch on a boundary that is orthogonal to the time direction here.Having extended u to the half space with t ≥ 0, we now also patch with the constant function s 0 on the half space with t < 0. We mollify the sequence at a scale and shift it forward on the λ scale to construct a sequence ũλ that agrees with the constant s 0 at t = 0.The shift also converges in L 1 .
The end time is exactly the same, except that we can simply extend by ũλ (0, •) to times greater than 0.
Then s ∈ BV ((0, T ) × T d ; {s, −s}) and and the energy measure In particular, the total mass of σ λ is bounded above by the total cost. Therefore there is a subsequence and a nonnegative measure σ on [0, T ] × T d such that the σ λ converge in the weak-* topology, σ λ ⇀ σ. We aim to show the following density lower bounds with respect to the interfacial surface measure as well as the initial and end-time surface measures. Let Σ be the set of points (τ, z) ∈ (0, T ) × T d where the measure theoretic limit of s is not in ±s. Note that by our definition this does not include any initial or final time points.
(a) On (0, Here ν(τ, z) is the measure theoretic unit normal direction pointing outward to {s = s}, defined

To begin the proof of (a), we consider a point (τ 0 , z 0 ) ∈ Σ where the outward unit-normal ν 0 to {s = −s} is well defined. Consider a unit d-cube in the subspace orthogonal to ν 0 , ∈ ν 0 from the definitions of (3.22). Let Q = × [−1/2, 1/2]ν 0 denote the (d + 1)-dimensional unit cube with one of the axes oriented in the ν 0 direction, and consider also the rescaled cubes (τ 0 , z 0 ) + r Q that are centered at (τ 0 , z 0 ) with side lengths r. We say that a point (τ 0 , z 0 ) is regular if the limit exists and the rescaled s, see the notation defined in (3.17), satisfies where v ν is the step function from (4.13). Standard results [19] imply that (4.14) and (4.15) hold H d | Σ almost everywhere provided that Σ is rectifiable, which is the case for the jump set when s is in BV. Similarly, for (b) and (c), these conditions hold at the beginning and end times, where Σ is replaced by the slice {0} × T d or {T } × T d , and ν is replaced by the appropriate normal vector (the sign of s(0, z) or negative sign of s(T, z) in the time direction).

Now consider a regular point (τ 0 , z 0 ) ∈ Σ as above satisfying (4.14) and (4.15). From weak convergence, we deduce that lim λ→0 σ λ ((τ 0 , z 0 ) + r Q) = σ((τ 0 , z 0 ) + r Q) except for a countable set N of values of r. Furthermore, by (4.14) we have Then we can choose sequences r i and λ i such that By the scaling property of G λ (Lemma 3.7), and by dropping the remainder of the nonlocal term away from (τ 0 , z 0 ) + r i Q, we have By Lemma 4.6 we can choose t ∈ (0, 1) arbitrarily close to 1 so that, up to a subsequence, the λ i -traces of R (τ 0 ,z 0 ),r i ŝλ i on ∂(t Q) converge to v ν 0 in the sense of Definition 4.4. The cost decreases since the new cube is smaller:

Using Proposition 4.10 we construct s i patched which "extends" R (τ 0 ,z 0 ),r i s λ i to t × R ν 0 by patching with v ν 0 at distance t/2 away from the tangent hyperplane. Corollary 4.11 and the convergence of the λ We use Proposition 4.12(a) to further replace s i patched by s i periodic , a t periodic function of R d+1 . This does not increase the cost due to the agreement of the λ i trace limits along the boundary of t Since v ν 0 is constant on the components of (t × R ν 0 )\t Q, we have Lemma 4.9 bounds the nonlocal defect for the periodic approximation s i periodic so that Again using the scaling Lemma 3.7, and Proposition 4.12 that allows us to assume that s i periodic is continuously differentiable, we have LR ν 0 ≤ F λ i /r i s i periodic , t × R ν 0 which concludes the proof for (a) after chaining together the inequalities and taking t close to 1.
At the initial time the argument is identical, except that when defining s i periodic we must enforce that s i periodic ∈ X init R (s 0 (z), s(0, z), ), i.e., that s i periodic (0, x) = s 0 (z). This is achieved by Proposition 4.12(b), which patches with the constant function s 0 (z) in the domain t < 0 and shifts slightly forward in time so that s i periodic (0, x) = s 0 (z) holds. The rest of the argument goes through exactly as before, working on t × R + ν 0 where ν 0 points forward in time.
At the final time we have (for Q − the intersection of the cube with the lower half plane and ν 1 the unit-vector in the negative time direction) Φ ŝλ i (T, x) dx = G λ i /r i R (τ 0 ,z 0 ),r i s λ i ; Q − + g(z + r i y) ŝλ i (T, z + r y) + 1 2β Φ ŝλ i (T, z + r y) dy.
At points of Lebesgue density of g, we can approximately replace g(z + r i y) in the line above with g(z). As before we construct s i periodic ∈ X end R (s(T, z), ) in Proposition 4.12(c), making sure to preserve lim inf i→∞ t g(z) R (T,z),r i s(0, x) + 1 2β Φ R (T,z),r i s(0, x) dx − t g(z) s i periodic (0, x) + 1 2β Φ s i periodic (0, x) dx ≤ 0.

Proof of Theorem 4.16. The proof is a direct adaptation of [1] until we reach the proof of (c) below, which considers patching recovery sequences in neighboring domains. Since s is polyhedral we can write [0, T ] × T d as a finite union of polyhedral subdomains satisfying the hypotheses of (a) or (b), see Figure 7. (For example, take a Voronoi-type decomposition, and then add regions of type (a) as necessary to achieve the projection hypothesis in (b).) In particular, even though the constants K in (ii) may increase by a finite factor at each union stage, this causes no problem since there are only finitely many such unions.
Note that once we have proven (a)-(c), since δ > 0 was arbitrary, by a diagonal argument, we can find the recovery sequence s λ .
Proof of (a): In this case, s is constant, equal to either s or −s, in each connected component of A, and the recovery sequence is the trivial one, s λ = s. The nonlocal and Dirichlet parts of the energy are zero for constants, and the double-well potential is zero on ±s, so G λ (s λ ; A) = 0.

Proof of (b): We divide into cases depending on whether the flat interface Σ is in {0} × T d , in {T } × T d , or is a face of Jump(s).
First suppose Σ is a face of Jump(s). Let ν be the (constant) inner space-time normal to the affine plane containing Σ. From the definition of L(ν), let ∈ ν and w be an element of X R ( ) (defined in (3.20)) with We fix some (τ̄, z̄) ∈ Σ and let s λ (τ, z) := w(λ −1 (τ − τ̄), λ −1 (z − z̄)). Note that since w ∈ X R ( ) the property (4.16) is satisfied with K = R. Recall that X R ( ) consists of C 1 functions, so by the chain rule |∂ t s λ | ≤ Kλ −1 , increasing K if necessary. The remainder of the argument is the same as in [1]: the λ-period cells tile Σ except for an O(λ)-neighborhood of ∂Σ, which has surface measure O(λ) because Σ is polyhedral.
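The boundary count in the tiling step can be illustrated numerically. The following minimal Python sketch considers the model case where the face Σ is the unit square (so d = 2) tiled by aligned cells of side λ = 1/n, and computes the total area of the cells that touch ∂Σ. The function name `boundary_cell_area` and the choice of a square face are assumptions made for this illustration and are not part of the argument above.

```python
def boundary_cell_area(n):
    """Total area of the cells of side 1/n in the unit square that touch its boundary.

    Model assumption: the face is the unit square and the period cells are
    aligned with it, so cell (i, j) touches the boundary exactly when it lies
    in the first or last row or column.
    """
    lam = 1.0 / n
    touching = sum(
        1
        for i in range(n)
        for j in range(n)
        if i in (0, n - 1) or j in (0, n - 1)
    )
    return touching * lam * lam


if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        lam = 1.0 / n
        # The ratio area / lam stays bounded, which is the O(lam) statement.
        print(n, boundary_cell_area(n), boundary_cell_area(n) / lam)
```

In this model case the count is 4n − 4 cells, so the area equals 4λ − 4λ², which is O(λ) as λ → 0.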
Next suppose Σ is a component of {T } × T d \ (Jump(s) ∪ Jump(g)). We fix some z̄ with (T, z̄) ∈ Σ, so that s takes a constant value, either s or −s, on A, which we call s(A). Also g takes a constant value on Σ, g(z̄). Let ν now denote the normal vector oriented in the negative time direction. From the definition of V end , we choose ∈ ν and an element w of X end R (s(A), ) (defined in (3.28)) with | | −1 F 1 (w; × R + ν) + g(z̄) w(0, x) + 1 2β Φ w(0, x) dx ≤ V end (s(A), g(z̄)) + δ.

Let s λ (τ, z) := w(λ −1 (τ − T ), λ −1 (z − z̄)). As before we can conclude the compact support and time derivative bound properties of (4.16) from the properties of the space X end R . Using the projection condition π(A) = Σ and tiling Σ with λ-period cells, up to an O(λ)-error from the period cells intersecting ∂Σ as before, we have

Finally suppose Σ is a component of {0} × T d \ Jump(s), so again s takes a constant value, either s or −s, on A; call that value s(A). Similarly, s 0 is constant on Σ, so we let s 0 (Σ) denote the value. From the definition of V init , let R > 0, ν be oriented in the positive time direction, ∈ ν and w be an element of X init R (s 0 (Σ), s(A), ) (defined in (3.24)) with Φ(s 0 (Σ)) ≤ V init (s 0 (Σ), s(Σ)) + δ.
Proof of (c): This is the point where new arguments are needed. Essentially, the patching procedure of Proposition 4.10 is carried out again here, but the simpler boundary conditions allow us to make more explicit estimates.
Given disjoint sets A 1 , A 2 ∈ A, set A := A 1 A 2 and ∆ := ∂A 1 ∩ ∂A 2 . Note that ∆ is contained in a finite union of affine hyperplanes. By assumption, there are sequences s λ j defined, respectively, on A j satisfying hypothesis (ii). Define We need to regularize sλ across the interface ∂A 1 ∩ ∂A 2 , at least in the time variable. As long as r ≥ λ this is bounded by Cλ −1 . By hypothesis (ii) we know sλ ≡ s in and so we can conclude s λ ≡ s λ j in A j \ [Γ (K 1 +K 2 +1)λ ∩ ∆ 2r ]. The mollification converges uniformly away from the jump set, which also implies convergence in L 1 at the initial time. Let K = K 1 + K 2 . Then we have shown that s λ satisfies (4.16) with the constant K.
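The two properties of the time regularization used here, a derivative bound of order 1/r and convergence in L 1 as r → 0, can be checked on a one-dimensional model. The sketch below is only an illustration under stated assumptions: the text does not specify the mollifier, so a hat kernel of width r is assumed, the function being regularized is taken to be the step sign(t), and all function names are hypothetical.

```python
def hat_cdf(x):
    """CDF of the hat kernel rho(x) = max(0, 1 - |x|), supported on [-1, 1]."""
    if x <= -1.0:
        return 0.0
    if x <= 0.0:
        return 0.5 * (1.0 + x) ** 2
    if x <= 1.0:
        return 1.0 - 0.5 * (1.0 - x) ** 2
    return 1.0


def mollified_step(t, r):
    """Convolution of sign(t) with the hat kernel rescaled to width r (assumed mollifier)."""
    return 2.0 * hat_cdf(t / r) - 1.0


def l1_error(r, n=2000):
    """Midpoint-rule approximation of the L1 distance to sign(t); the integrand vanishes outside [-r, r]."""
    h = 2.0 * r / n
    total = 0.0
    for k in range(n):
        t = -r + (k + 0.5) * h
        sign = 1.0 if t > 0 else -1.0
        total += abs(mollified_step(t, r) - sign) * h
    return total


def max_slope(r, n=2000):
    """Largest finite-difference slope of the mollified step on [-r, r]."""
    h = 2.0 * r / n
    return max(
        (mollified_step(-r + (k + 1) * h, r) - mollified_step(-r + k * h, r)) / h
        for k in range(n)
    )


if __name__ == "__main__":
    for r in (0.1, 0.05, 0.025):
        # max_slope * r stays near 2 and l1_error / r stays near 2/3.
        print(r, max_slope(r) * r, l1_error(r) / r)
```

For this kernel the derivative of the mollified step is twice the kernel, so its maximum is 2/r, and the L 1 distance to the step is 2r/3; with r ≥ λ the slope is therefore at most 2λ −1 in this model.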
Because ∆ and Γ are finite unions of d-dimensional polyhedral sets which meet transversally, We estimate the energy in the overlap region using (4.18). The double-well term is immediate using W β ([−s, s]) ≤ W β (0) The derivative term is estimated using (4.17). Finally, for the nonlocal cross term we use Lemma 4.9 to find lim sup λ→0 N λ (s λ ; A 1 , A 2 ) = 0.

Figure 1. Schematic diagram showing a cross section of a solution s in d ≥ 2 (in d = 1 such a catenoidal-type solution would not occur). Boundary layers at scale λ are displayed around the phase interface and at the initial and final times.

Figure 2. A spatial slice and a time slice of a 2D simulation, with $\beta^{-1} = 0.9$, $\hat J = 1$, $T = 1.5$, where J is a standard Gaussian and $\lambda = 1/40$. Numerics were implemented using a simple forward-backward iteration of (3.5) with $80^2$ spatial grid points and 300 time grid points.

Figure 4. On the left, a plot of $V_{\mathrm{init}}$ on the interval [−s, s], with $\beta = 0.9^{-1}$ and $\hat J = 1$. On the right, three one-dimensional solutions for the mean spin field.

Figure 5. On the left, a plot of L(c). On the right, a spatial slice of three solutions to the cell problem with different front speeds c. Parameters are $\beta = 0.9^{-1}$ and $\hat J = 1$.
we can see again the critical value at β Ĵ = 1. When β Ĵ > 1 the critical point at the origin is a local maximum and there are two local minima at s = 1 − β −2 Ĵ−2 and −s.

Proposition 4.10 (Defect estimate). Let A be an open set, Σ be a finite union of subsets of affine d-dimensional planes with a normal direction ν ∈ S d defined H d | Σ -a.e., and u :

Figure 6. Open region A intersected by a defect surface Σ made up of two affine pieces Σ j ; the λ-time neighborhoods and their overlap are displayed.

Figure 7. Example of a polyhedral decomposition so that each subregion is either of the type considered in (a) or in (b).
Proving (i) of Theorem 4.1 by lower-semicontinuity
In this section, we show part (i) of Theorem 4.1, the lower bound inequality of the Γ-convergence: any sequence s λ with bounded cost which converges in L 1 to a limit s has asymptotic cost bounded from below by the effective cost V (s 0 , g, s). For this we follow a now standard idea introduced by Fonseca and Müller [23]: it suffices to show that the H d density of the limiting total variation measure is bounded from below by the respective value of the effective functional. The key technical tools in this argument are the patching estimates of Proposition 4.10 and Proposition 4.12, which allow us to patch the local values of s λ into a global periodic test minimizer for the appropriate cell problem.